Abstract


Artificial  Intelligence  reasoning  systems  commonly  contain  a  large  corpus  of  declarative 
knowledge, called a knowledge base (KB), and provide facilities with which the system’s components
can retrieve this knowledge. This thesis sets out to study the very nature of retrieval.
Formal  specifications  that  capture  certain  informal  intuitions  about  retrieval  are  developed, 
studied,  and  implemented  by  retrieval  algorithms. 

Consistent  with  the  necessity  for  fast  retrieval  is  the  guiding  intuition  that  a  retriever  is, 
at  least  in  simple  cases,  a  pattern  matcher,  though  in  more  complex  cases  it  may  perform 
selected  inferences  such  as  property  inheritance. 

Seemingly  at  odds  with  this  intuition,  this  thesis  views  the  entire  process  of  retrieval  as  a 
form of inference and hence the KB as a representation, not merely a data structure. A retriever
makes a limited attempt to prove that a queried sentence is a logical consequence of the
KB. When constrained by the no-chaining restriction, inference becomes indistinguishable from
pattern-matching. If the KB is imagined as divided into quanta, a retriever that respects this
restriction cannot combine two quanta in order to derive a third.

The  techniques  of  model  theory  are  adapted  to  build  non-procedural  specifications  of 
retrievability  relations,  which  determine  what  sentences  are  retrievable  from  what  KB’s. 
Model-theoretic  specifications  are  presented  for  four  retrievers,  each  extending  the  capabilities 
of  the  previous  one.  Each  is  accompanied  by  a  rigorous  investigation  into  its  properties,  and  a 
presentation  of  an  efficient,  terminating  algorithm  that  provably  meets  the  specification. 

The  first  retriever,  which  operates  on  a  propositional  language,  handles  only  yes/no 
queries,  the  second  also  handles  wh-queries,  and  the  third  allows  quantifiers  in  the  KB  and  the 
query.  Each  is  shown  to  be,  in  some  sense,  the  strongest  retriever  that  meets  the  no-chaining 
restriction. 

The  third  retriever  forms  an  excellent  basis  for  integrating  a  specialized  set  of  inferences 
that  chain  in  a  controllable  manner.  This  is  achieved  by  incorporating  taxonomic  inference, 
such as inheritance, to form the fourth retriever, an idealized version of the retriever incorporated
in the ARGOT natural language dialogue system. It is characterized by its ability to
infer  all  consequences  of  its  taxonomic  knowledge  without  otherwise  chaining. 
Chapter 1


Introduction 

One  truism  of  artificial  intelligence  is  that  an  intelligent  system  must  have  vast  amounts 
of  knowledge  of  its  domain.  Some  of  this  knowledge  appears  to  be  maintained  independently  of 
its  use.  For  example,  I  know  that  certain  English  words  are  considered  rude.  I  can  use  this 
knowledge  in  achieving  my  goals,  thereby  behaving  differently  when  I  choose  to  insult  from 
when  I  choose  to  be  polite.  I  can  also  use  this  knowledge  to  recognize  the  intentions  of  other 
speakers.  Furthermore,  I  can  verbalize  this  knowledge  in  order  to  inform  a  non-native  speaker 
of  our  conventions.  Knowledge  of  this  nature  is  often  called  "declarative,"  though  the  word  is 
somewhat  ill-defined. 

Accordingly,  artificial  intelligence  reasoning  systems  commonly  contain  a  large  corpus  of 
declarative  knowledge,  called  a  knowledge  base  (KB),  and  provide  facilities  with  which  the 
system’s  components  can  retrieve  this  knowledge.  An  example  of  such  a  system,  the  ARGOT 
dialog  participation  system,  is  outlined  in  the  next  section. 

This  thesis  addresses  the  problem  of  building  a  knowledge  retriever  and  obtaining  a 
thorough understanding of it. Though my study of retrieval began by designing and implementing
the retriever in ARGOT, and the retriever examined in this thesis forms the core of
the  ARGOT  retriever,  the  goal  of  the  work  is  not  primarily  the  construction  of  a  particular 
retriever.  Rather,  its  goal  is  more  generally  to  gain  an  understanding  of  the  very  nature  of 
retrieval  and  to  produce  techniques  useful  in  addressing  this  and,  hopefully,  related  problems. 

I  intend  to  attain  these  general  goals  while  constructing  a  specific  retriever  by  carrying  out 
the  construction  in  a  principled  way.  First,  I  elucidate  certain  intuitions  about  the  nature  of 
retrieval  and  put  forth  criteria  that  a  retriever  must  satisfy.  Then  I  formally  specify  a  retriever 
that captures the intuitions and meets the criteria. Finally, I design an algorithm that implements
the specification.

In  sum,  the  principal  contribution  of  this  thesis  is  the  transformation  of  retrieval  from  an 
ill-defined  unstudied  process  to  a  formally-defined  and  well-studied  one. 


1.1.  An  Overview  of  ARGOT’s  Organization 

A very brief overview of ARGOT’s organization suffices to illustrate the structure of a system
containing a KB and knowledge retriever and to show how these components are used.
Allen,  Frisch  and  Litman  (1982)  describe  ARGOT  in  more  detail. 

ARGOT  is  designed  to  play  the  role  of  a  computer  operator  partaking  in  extended  English 
dialogs  with  computer  users.  As  depicted  in  Figure  1.1,  the  system  is  divided  into  three 
concurrently-running modules: a task goal reasoner, a communicative goal reasoner, and a
linguistic reasoner. All declarative knowledge used by the three modules is stored in a common
knowledge  base  and  is  accessed  only  through  a  knowledge  retriever.  Hence,  the  only  view  of  the 
KB  available  to  the  system’s  modules  is  that  provided  through  the  query-processing  facilities  of 
the  retriever.  The  internal  structure  of  the  KB  is  completely  hidden. 

The  reasoning  modules  of  ARGOT  are  not  general-purpose  reasoning  systems  but  rather 
are highly specialized for the tasks they perform. On the other hand, the retriever is not specialized
for any particular task such as linguistic reasoning, though it is able to do some taxonomic
and temporal reasoning.

1.2.  Knowledge  Retrieval  as  Inference 

At  all  points  throughout  this  thesis  I  keep  in  mind  that  a  KB  is  not  only  a  data  structure, 
but  also  a  representation.  By  this  I  mean  that  the  KB  makes  a  set  of  assertions,  sentences  to 
which  a  semantics  attributes  truth  conditions. 

In  response  to  a  query  a  retriever  returns  one  or  more  sentences.  (A  sentence  that  can  be 
retrieved  from  a  KB  by  some  query  is  said  to  be  retrievable.)  In  doing  this,  the  retriever  must 
respect  the  semantics  of  the  language  it  operates  on.  More  precisely,  a  retriever  should  return  a 
sentence  only  if  the  truth  of  the  sentence  is  assured  by  the  truth  of  the  KB,  that  is,  only  if  the 
sentence  is  entailed  by  the  KB. 

These  considerations  lead  to  the  viewpoint  that  retrieval  is  inference,  which  is  taken  to  be 
a  mechanism  that  derives  logical  consequences.  But  what  kind  of  inference  is  retrieval?  How 
much  and  what  kind  of  inference  capability  should  be  packaged  up  in  a  retriever? 

Let us address these questions by first considering the extreme position of making a retriever
as strong as possible. Such a retriever would be able to retrieve all logical consequences of
the KB. An example of this approach is the KRYPTON knowledge representation system
(Brachman, Fikes and Levesque, 1983; Brachman, Gilbert and Levesque, 1985), which employs
a complete inference mechanism known as theory resolution (Stickel, 1985). However, in discussing
their design of KRYPTON, Brachman, Gilbert and Levesque (1985) express dissatisfaction
with this choice of inference mechanism:

We would no doubt have used a more computationally tractable inference framework
than full first-order logic if an appropriate one were available... the full
first-order  resolution  mechanism  is,  in  a  sense,  too  powerful  for  our  needs. 

Indeed, in any language as expressive as the first-order predicate calculus, determining
whether one sentence entails another is only semi-decidable. If we design a system that relinquishes
control to its retriever, then we must demand that the retriever return control and that it
do so quickly.

To design a retriever that computes all, or even most, of the logical consequences of a KB
is to put the muscle of the system in the wrong place. The power of the system belongs in the
special-purpose modules of the system, which can be built with the necessary domain-dependent
control structures, not in the knowledge retriever, which is a domain-independent
inference engine.

Efficiency rather than power is the primary consideration in designing a retriever. A system
can compensate for a retriever that is efficient but weak by performing computations itself,
but  it  cannot  compensate  for  a  retriever  whose  excessive  power  leads  to  inefficiency. 

The  above  considerations  lead  to  the  conclusion  that  retrieval  is  limited  inference.  This 
raises  the  question  to  which  we  now  turn:  what  inferences  should  a  retriever  perform? 

1.3.  Retrieval  as  Pattern  Matching 

A common intuition, whose prevalence can be witnessed by examining the range of retrievers
in use, is that retrieval is fundamentally a pattern-matching operation. According to this
intuition  a  KB  consists  of  a  set  of  data  objects  and  a  query  supplies  a  target  pattern  to  which 
the  retriever  responds  by  reporting  which  data  objects  match  the  pattern.  Whether  or  not  a 
particular  data  object  matches  the  target  is  independent  of  the  other  data  objects.  Hence,  in  a 
parallel  implementation  each  data  object  can  be  realized  by  a  distinct  processor  and  there  is  no 
need  for  communication  between  them.  The  idea  of  using  pattern  matching  to  access  a 
memory  structure  is  a  familiar  one,  present,  for  example,  in  the  notion  of  a  content-addressable 
memory. 

A  strict  pattern-matching  system,  such  as  a  content-addressable  memory,  never  combines 
two or more data objects in order to match a target successfully. Doing so would require communication
between the processors in a content-addressable memory. I call such an operation
"chaining." 

In  general,  chaining  is  the  operation  of  combining  two  or  more  pieces  of  a  representation 
together  in  order  to  derive  a  third.  The  archetypal  form  of  chaining  occurs  in  applying  the  rule 
of modus ponens to infer Q from P and P → Q.

The  simplest  form  of  pattern  matching  occurs  in  matching  an  object  against  itself.  Hence, 
if  retrieval  is  pattern  matching,  a  retriever  should  be  able  to  retrieve  everything  that  has  been 
added  explicitly  to  the  KB.  A  retriever  that  can  do  this  is  said  to  satisfy  the  verbatim  retrieval 
criterion. For example, if I add "P ∨ Q" to a KB, then its retriever should report "yes" when I
query "P ∨ Q". It should do this readily by matching "P ∨ Q" to itself.

Contrast  this,  the  simplest  case  of  pattern  matching,  with  the  operation  of  a  resolution 
theorem prover. Using a refutation strategy, it replaces the problem of proving that P ∨ Q
entails P ∨ Q with the problem of proving that {P ∨ Q, ¬P, ¬Q} is unsatisfiable. It can solve
this problem by finding one of the resolution refutations of Figure 1.2. Each of these refutations
contains two applications of the resolution inference rule. Every application of this inference
rule resolves two clauses together to derive a third and hence involves chaining.
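
The two chaining steps of such a refutation can be traced in a few lines of code. The following is only a sketch for illustration: the representation of clauses as frozensets of literal strings, the "~" prefix for negation, and the function name resolve are assumptions made here, not notation from this thesis.

```python
def resolve(c1, c2):
    """Return every resolvent of two clauses.

    A clause is a frozenset of literal strings; "~P" is the negation of "P".
    (This encoding is an assumption of the sketch.)
    """
    resolvents = []
    for lit in c1:
        # Complement of the literal: strip or add the "~" prefix.
        comp = lit[1:] if lit.startswith("~") else "~" + lit
        if comp in c2:
            resolvents.append(frozenset((c1 - {lit}) | (c2 - {comp})))
    return resolvents


# The clause set {P v Q, ~P, ~Q} to be shown unsatisfiable.
p_or_q = frozenset({"P", "Q"})
not_p = frozenset({"~P"})
not_q = frozenset({"~Q"})

# First application: two clauses are combined to derive a third (Q).
step1 = resolve(p_or_q, not_p)[0]
# Second application: the derived clause is combined with ~Q,
# yielding the empty clause and completing the refutation.
step2 = resolve(step1, not_q)[0]
```

Each call to resolve consumes two clauses and produces a new one, which is exactly the kind of operation that a verbatim match of "P ∨ Q" against itself never needs.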

To increase their power, pattern matchers are often extended with rewrite or deduction
rules. One such system is SNePS (Shapiro, 1979), a semantic-network system that a user
accesses by specifying a target network-pattern to be matched against the network in the KB.
SNePS also allows the user to specify backward chaining rules that are used to attempt to
rewrite target patterns.¹ Database systems also are frequently extended with facilities to handle
rewrite rules; such systems are called deductively-augmented data bases. A final example of
the incorporation of rewrite rules into a pattern matcher is the ubiquitous use of inheritance in
semantic-network systems. Unlike the rewrite rules used by SNePS or deductively-augmented
data bases, which are added to the system by the user, the rewrite rules that constitute inheritance
are built directly into semantic-network retrievers.

We are now in a position to answer the question, "What is retrieval?" Retrieval is an
inference  process,  limited  in  such  a  way  that  it  is  fundamentally  a  pattern-matching  process, 
though  it  may  be  extended  to  do  a  bounded  amount  of  some  specialized  form  of  chaining.  Though 
I  take  this  as  being  an  accurate  characterization  of  retrieval,  it  is  an  intuitive  characterization, 
not  a  rigorous  or  precise  one.  This  is  appropriate  because  the  notion  of  retrieval  is  intuitive 
rather  than  technical.  The  intuitive  nature  of  this  characterization  arises  from  its  formulation 
in  terms  of  the  notion  of  pattern  matching,  itself  an  intuitive  notion. 

Let  us  examine  our  intuitions  about  pattern  matching  more  closely.  The  weakest 
pattern-matching  retriever  can  retrieve  only  what  is  contained  explicitly  in  the  KB.  However, 
pattern  matchers  can  often  succeed  at 

• retrieving P ∨ Q from a KB containing Q ∨ P, and

• retrieving Q from a KB containing P ∧ Q.

While  our  intuitions  clearly  allow  us  to  call  these  retrievals  pattern  matching,  our  intuitions  get 
murky  when  we  try  to  see  how  far  the  notion  extends.  Which  of  these  actions  should  be  called 
pattern  matching: 

• retrieving P ∨ Q from a KB containing Q,

• retrieving ¬¬P from a KB containing P, and

• retrieving P → Q from a KB containing ¬P ∨ Q.

Though  it  is  not  clear  where  the  intuitive  notion  of  pattern  matching  ends,  it  is  clear  that 
retrievals  involving  chaining,  such  as 

• retrieving Q from a KB containing P and P → Q,
are not pattern matching.

Unlike  pattern  matching,  chaining  becomes  a  precise  term  once  a  method  of  breaking  a 
representation  into  quanta  has  been  determined.  Whether  or  not  a  given  retriever  performs 
chaining  is  then  a  cut  and  dried  question.  Because  the  notion  of  chaining  is  precise  and  is 
closely aligned with the intuitive notion of pattern matching, I henceforth use no-chaining,
rather than pattern matching, as the defining characteristic of retrieval.

¹ SNePS also allows forward chaining rules to be specified.

The  effect  of  this  no-chaining  restriction  depends  critically  on  the  granularity  at  which 
knowledge  is  quantized.  At  one  extreme,  if  the  entire  KB  is  considered  to  be  a  single  quantum, 
the  no-chaining  restriction  becomes  vacuous;  there  can’t  be  any  chaining  simply  because  there 
aren’t  two  quanta  to  be  chained  together.  At  the  other  extreme,  if  each  quantum  is  merely  an 
atomic  sentence  then  only  verbatim  retrievals  can  be  performed;  the  only  atomic  sentence 
entailed  by  an  atomic  sentence  is  the  sentence  itself. 

If the elimination of chaining is to eliminate all inference that does not correspond intuitively
to pattern matching, then the KB must be divided into fine-grained quanta. As an illustration,
if the no-chaining restriction is to disallow

• retrieving Q from a KB containing P ∧ (P → Q),
then the KB should be divided into two quanta, P and P → Q. I shall return to the issue of
quantization in Chapter 2, where the design of the first retriever is undertaken.

1.4.  Thesis  Synopsis 

The principal technical result of this thesis is the formal specification of a knowledge retriever
that performs all inferences that don’t require chaining, all inferences that involve taxonomic
information, but no others. The specified retriever is a highly specialized inference
engine; it does all chaining of a certain sort but no other chaining. Relative to the way that
knowledge is quantized, this is the strongest retriever that does no chaining other than with
taxonomic information.

This  retriever  operates  on  a  first-order  logic.  While  some  may  argue  that  this  language  is 
not sufficiently expressive for certain AI tasks such as natural language processing, it is expressive
enough to make the retrieval problem difficult. Most notably, entailment is undecidable in
this  logic. 

This  retriever  is  specified  as  the  last  in  a  series  of  four  retrievers,  each  of  which  extends 
the  capabilities  of  the  previous  one.  The  four  specifications  are  presented  successively  in 
Chapters 2 through 5 of this document. Accompanying each specification is a rigorous investigation
into its properties, a presentation of an efficient, terminating algorithm, and a proof that
the  algorithm  meets  the  specification. 

The  first  retriever,  which  operates  on  a  propositional  language,  handles  only  yes/no 
queries.  The  second  retriever  additionally  handles  wh-queries;  its  algorithm  introduces 
unification  into  the  simple  algorithm  of  the  first  retriever.  An  important  result  shows  that  even 
when  there  are  an  infinite  number  of  answers  to  a  wh-query,  the  set  of  answers  can  be  finitely 
characterized. 

The  third  retriever  extends  the  second  by  allowing  quantifiers  in  the  KB  and  the  query. 
First, it is shown that the obvious, straightforward treatment of quantifiers results in a retriever
that is not guaranteed to terminate. Closer examination of the difficulty reveals that
this  treatment  of  quantifiers  subtly  violates  the  no-chaining  restriction.  A  new  specification  is 
presented,  which  uniformly  respects  the  no-chaining  restriction  throughout  its  treatment  of  the 
entire  language,  including  the  quantifiers.  Analysis  of  the  relationship  between  quantifiers  and 
wh-queries  reveals  how  the  second  retrieval  algorithm  can  be  used  to  implement  this  third 
specification. 

Each  of  these  first  three  retrievers  meets  the  no-chaining  restriction;  moreover,  for  the 
class  of  problems  handled  each  is  the  strongest  retriever  that  meets  the  restriction,  modulo  the 
way  that  knowledge  is  quantized.  If  the  thesis  were  to  end  here,  it  could  aptly  be  entitled  "The 
Logic  of  No-Chaining." 

The fourth retriever integrates taxonomic inference such as inheritance into the pattern-matching
inference of the third retriever. The algorithm for this fourth retriever is interesting
in that it uses the taxonomic information solely during unification. A theorem is presented that
states the conditions that are necessary and sufficient to justify this computational technique.
The technique and the theorem justifying it are both general enough to be used to add a taxonomic
component to almost any computational system that works with schematic variables,
including rewrite systems, grammars and their parsers, theorem-provers, and logic-programming
systems. This fourth retriever forms the core of the retriever incorporated in
ARGOT.

1.5.  A  Method  for  Specifying  Retrievers 

Rather  than  specify  how  a  retriever  operates,  this  thesis  concentrates  on  specifying  what  a 
retriever  computes.  This  is  done  by  specifying  a  retrievability  relation  that  determines  what 
sentences are retrievable from what KB’s, just as a provability relation determines what sentences
are provable from what sets of axioms. Thus, retrievability is a relation between
knowledge bases and sentences. If knowledge base kb and sentence q stand in this relation then
we say that q is retrievable from kb and write kb →_R q. Thus, the problem of specifying a retriever
comes down to one of specifying a retrievability relation. This section puts forward a
model-theoretic specification method that is used throughout this work.²

Each  of  the  four  retrievers  of  this  thesis  operates  on  a  language  whose  semantics  is  given 
by  a  standard  Tarskian  (1935)  model  theory.  Each  model  theory  yields  an  entailment  relation 
which,  analogous  to  a  provability  or  retrievability  relation,  defines  what  sets  of  sentences  entail 
what sentences. Entailment relations are well-suited for use in specifications because they are
precise, often have simple definitions, and abstract away from all issues of formal syntactic
operations. However, Tarskian entailment cannot be used to specify retrievability; what is
retrievable from a KB is but a small subset of what is entailed by a KB.

² Frisch (1985a) provides a more general discussion of this technique.

The  approach  advocated  here  for  specifying  retrievability  is  to  produce  another  model 
theory whose entailment relation is weakened³ in such a way that the retriever is a sound and
complete  inference  engine  with  respect  to  it.  Many  people  initially  find  this  approach  quite  odd. 
They  are  accustomed  to  thinking  of  a  model  theory  as  specifying  what  can  be  concluded  validly 
from  what — in  some  sense,  as  a  competence  theory  of  inference.  I  suggest  that  those  who  are 
comfortable  with  this  viewpoint  consider  the  weaker  model  theory  as  a  performance  theory  of 
inference.  Other  people  are  accustomed  to  thinking  of  a  model  theory  as  a  way  of  assigning 
meaning to symbols and are skeptical of producing a new meaning assignment. But I am not
suggesting that the original model theory be discarded; on the contrary, it is still a valuable
device in the study of meaning. The new model theory can be thought of as providing an additional
meaning assignment. If the retriever is working under this alternative meaning then it is
a complete inference engine. Hence, the symbols mean one thing to us and another to the retriever.
According to our theory of meaning the retriever is incomplete but according to its
weaker theory of meaning it is complete.

How  can  these  new,  weaker  model  theories  be  produced  and  what  is  their  relationship  to 
the  unweakened  model  theory?  To  answer  the  question  consider  a  model  theory  as  laying  down 
a  set  of  constraints  on  what  constitutes  a  model.  Of  all  (mathematical)  objects,  only  those  that 
satisfy  the  constraints  qualify  as  models.  A  model  theory  also  associates  with  each  model  a 
valuation, a total function from sentences to their truth values. Hence, a model theory constrains
the range of valuations that can be generated. In a standard propositional logic, for
example, these constraints ensure that any valuation that assigns two sentences True also
assigns their conjunction True. The entailment relation associated with a model theory is a
product solely of the range of valuations that the model theory generates. Relaxing the constraints
produces a new model theory, one that may generate additional valuations. No matter
how the constraints are relaxed, the new model theory must have a weaker entailment relation
than the original. That is, if ⊨₁ and ⊨₂ are entailment relations and ⊨₂ is obtained by relaxing
the model theory for ⊨₁, then α ⊨₂ β implies α ⊨₁ β. To see this, observe that a valuation can
serve only as a counterexample to a claim that one sentence entails another; hence if none of the
valuations from the relaxed model theory are counterexamples then certainly none from the original
model theory are.
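
The counterexample argument can be spelled out as a short derivation. The names V_1 and V_2 for the sets of valuations generated by the original and the relaxed model theory are introduced here only for illustration; the sketch assumes, as the text states, that relaxing constraints can only add valuations.

```latex
\begin{align*}
  &V_1 \subseteq V_2
    && \text{(relaxing constraints removes no valuations)} \\
  &\alpha \models_i \beta \iff
    \forall v \in V_i :\; v(\alpha) = \mathrm{True} \Rightarrow v(\beta) = \mathrm{True}
    && (i = 1, 2) \\
  &\alpha \models_2 \beta
    \;\Longrightarrow\; \forall v \in V_2 :\; v(\alpha) = \mathrm{True} \Rightarrow v(\beta) = \mathrm{True} \\
  &\phantom{\alpha \models_2 \beta}
    \;\Longrightarrow\; \forall v \in V_1 :\; v(\alpha) = \mathrm{True} \Rightarrow v(\beta) = \mathrm{True}
    && \text{(since } V_1 \subseteq V_2\text{)} \\
  &\phantom{\alpha \models_2 \beta}
    \;\Longrightarrow\; \alpha \models_1 \beta
\end{align*}
```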


³ One entailment relation is weaker than another if the inferences sanctioned by the first are a subset of those
sanctioned by the second.


1.5.1.  A  Toy  Example 

Consider  a  program  that  reasons  about  an  arbitrary  equivalence  relation  named  "r".  A 
user  communicates  with  the  program,  making  assertions  and  queries,  each  of  which  specifies  a 
sentence of the form r(α,β), where α and β are symbols drawn from some lexicon.

This  program  can  be  specified  in  terms  of  the  symbolic  manipulations  it  performs.  There 
are  many  conceivable  specifications  but  for  the  sake  of  argument  let  us  say  that  the  program 
works  by  maintaining  a  collection  of  disjoint  sets.  Initially  there  is  a  unit  set  for  each  symbol  in 
the lexicon. Whenever the user asserts r(α,β) the program combines the sets that contain α
and β into a single set. To respond to the query r(α,β) the program simply determines whether
the elements α and β are in the same set. There are many well-studied algorithms for performing
this set-union task (Tarjan and van Leeuwen, 1984), the best of which can process a series
of n assertions and queries in slightly greater than O(n) time.
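
One way such a disjoint-set program might look is sketched below. This is a minimal illustration, not an implementation taken from the cited work: the class name EquivalenceReasoner and the method names assert_r and query are hypothetical, and union by size with path compression is assumed as the set-union strategy.

```python
class EquivalenceReasoner:
    """Maintains disjoint sets of symbols related by the equivalence relation r."""

    def __init__(self, lexicon):
        # Initially there is a unit set for each symbol in the lexicon.
        self.parent = {s: s for s in lexicon}
        self.size = {s: 1 for s in lexicon}

    def _find(self, s):
        # Locate the representative of the set containing s.
        root = s
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every node on the path at the root.
        while self.parent[s] != root:
            self.parent[s], s = root, self.parent[s]
        return root

    def assert_r(self, a, b):
        # Combine the sets that contain a and b into a single set.
        ra, rb = self._find(a), self._find(b)
        if ra == rb:
            return
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]

    def query(self, a, b):
        # r(a, b) succeeds exactly when a and b are in the same set.
        return self._find(a) == self._find(b)


reasoner = EquivalenceReasoner(["a", "b", "c", "d"])
reasoner.assert_r("a", "b")
reasoner.assert_r("b", "c")
```

After these two assertions, reasoner.query("a", "c") succeeds because the sets containing a, b and c have been merged, while reasoner.query("a", "d") fails.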

To  the  user  of  the  system  this  specification  is  too  detailed  and  too  concrete.  He  does  not 
need  to  know,  nor  does  he  care,  whether  the  program  works  one  way  or  another.  At  the  level  of 
abstraction  with  which  the  user  is  concerned  the  various  implementations  are  all  identical.  (Of 
course,  for  other  concerns,  such  as  implementation,  there  is  a  world  of  difference  between 
different  specifications.)  A  more  abstract,  non-procedural  definition  of  this  reasoning  system 
can  be  obtained  by  replacing  the  question  "What  does  the  reasoner  do?"  with  "Given  a  set  of 
previously-asserted sentences, what query sentences succeed?" At this higher level of abstraction
the various set-union algorithms are all equivalent.

In  response  to  the  question  of  what  queries  succeed,  it  can  be  shown  that  the  program 
described  above  answers  "yes"  to  a  query  if,  and  only  if,  it  legitimately  can  do  so  based  on  what 
it  has  been  told.  That  is,  it  answers  "yes"  if,  and  only  if,  the  queried  sentence  is  entailed  by  the 
set  of  asserted  sentences.  Forgive  my  pedantry  while  I  spell  out  the  obvious  details  of  the 
model  theory  that  gives  rise  to  the  entailment  relation  for  this  language;  these  details  will  be 
valuable  in  considering  how  to  relax  a  model-theoretic  specification. 

Each  of  the  model  theories  discussed  in  this  thesis  is  given  a  name.  The  one  presented 
next is called "E". In cases where confusion could arise, terms like "E-model" and "E-entailment"
and symbols like ⊨_E are used to indicate which model theory is under consideration.

An E-model is a pair (D,A) where D is a non-empty set of individuals called the domain
and A is a function that maps every symbol in the lexicon to an element of D and maps r to a
binary relation over D such that:

(1) A(r) is reflexive,

(2) A(r) is symmetric, and

(3) A(r) is transitive.

The valuation associated with (D,A) is the function that takes each sentence of the form
r(α,β) to True if the relation A(r) holds between A(α) and A(β), and to False otherwise.
These valuations can be used in the usual fashion to define the notions of E-satisfiability,
E-validity, and E-entailment for this language.

This  model  theory  serves  two  purposes  in  analyzing  the  program.  First,  it  provides  a 
rigorous  semantics  for  the  language  that  the  program  manipulates  and  in  doing  so  defines 
entailment  for  the  language.  Second,  it  is  used  in  specifying  what  the  program  computes  by 
stipulating  that  it  responds  "yes"  if,  and  only  if,  the  queried  sentence  is  E-entailed  by  the 
asserted sentences. In the case of this program the two uses go hand-in-hand because the program
is a sound and complete inference engine. But it is important to distinguish between
these  two  uses  of  a  model  theory  as  we  turn  our  attention  to  a  reasoning  program  that  is  not 
complete. 

Suppose  that  for  some  reason  we  were  not  happy  with  a  program  that  required  slightly 
greater  than  O(n)  time  to  process  a  series  of  n  assertions  and  queries.  (I  told  you  this  was  a 
toy  example!)  Furthermore,  suppose  that  we  were  willing  to  replace  the  set-union  algorithm 
with the following algorithm, which is much weaker but slightly faster. Whenever r(α,β) is
asserted, the program adds the pair (α,β) to an associative store. The program responds "yes"
to the query r(α,β) if, and only if, α and β are identical or the associative store contains
either (α,β) or (β,α).
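
The weaker algorithm is simple enough to sketch directly. In the sketch below a Python set of pairs stands in for the associative store; the class name PairStoreReasoner and the method names assert_r and query are hypothetical, chosen only for this illustration.

```python
class PairStoreReasoner:
    """Answers queries about r using only identity and stored pairs."""

    def __init__(self):
        # The associative store: a set of asserted (alpha, beta) pairs.
        self.store = set()

    def assert_r(self, a, b):
        # Asserting r(a, b) simply records the pair; nothing is combined.
        self.store.add((a, b))

    def query(self, a, b):
        # "yes" iff a and b are identical, or the pair was asserted
        # in either order. No two stored pairs are ever chained.
        return a == b or (a, b) in self.store or (b, a) in self.store


weak = PairStoreReasoner()
weak.assert_r("a", "b")
weak.assert_r("b", "c")
```

Here weak.query("b", "a") and weak.query("a", "a") succeed, but weak.query("a", "c") fails, since answering it would require combining two stored pairs.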

This program is incomplete with respect to E. For example, if only $r(a,b)$ and $r(b,c)$
have been asserted, the query $r(a,c)$ will not result in "yes" even though the queried sentence is
E-entailed by the two asserted sentences. E still gives a semantics for the language manipulated
by the program but it no longer specifies what the program computes. However, there is
a weaker model theory, call it Ew, whose entailment relation does specify the input/output
relation of this program. Ew is identical to E except that constraint (3), which says that $A(r)$
must be transitive, is eliminated. With respect to Ew the program is a sound and complete
inference engine, though admittedly soundness and completeness are normally taken to be with
respect to a model theory that specifies the meaning of the language.
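The contrast between the two toy programs can be sketched in a few lines of Python. This is only an illustration: the class names `PairStore` and `UnionFind` are hypothetical, and the union-find structure is an assumed stand-in for the "set-union algorithm" of the complete program.

```python
class PairStore:
    """Weak program: remembers asserted pairs and never chains them."""

    def __init__(self):
        self.pairs = set()  # the associative store

    def assert_r(self, a, b):
        self.pairs.add((a, b))

    def query_r(self, a, b):
        # "yes" iff the arguments are identical or the pair was asserted
        # in either order.
        return a == b or (a, b) in self.pairs or (b, a) in self.pairs


class UnionFind:
    """Complete program (assumed realization): merges equivalence classes."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            # Path halving keeps the trees shallow.
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def assert_r(self, a, b):
        self.parent[self.find(a)] = self.find(b)

    def query_r(self, a, b):
        return self.find(a) == self.find(b)


weak, strong = PairStore(), UnionFind()
for program in (weak, strong):
    program.assert_r("a", "b")
    program.assert_r("b", "c")

print(weak.query_r("a", "c"), strong.query_r("a", "c"))
```

The weak program answers the query r(a,c) negatively because it never combines two stored pairs, while the union-find program answers it positively.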

Every model in E is also a model in Ew, but not vice-versa. For example, consider the
model $\langle D',A'\rangle$ where $D' = \{1,2,3\}$ and $A'$ is such that:

$A'(a) = 1$
$A'(b) = 2$
$A'(c) = 3$
$A'(r) = \{\langle 1,1\rangle, \langle 2,2\rangle, \langle 3,3\rangle, \langle 1,2\rangle, \langle 2,1\rangle, \langle 2,3\rangle, \langle 3,2\rangle\}$

This model respects constraints (1) and (2) but not constraint (3). The valuation generated
by $\langle D',A'\rangle$ is not generated by any E-model, hence $\models_{Ew}$ is strictly weaker than $\models_{E}$.
Returning to the example previously cited, $\langle D',A'\rangle$ demonstrates that $r(a,b)$ and $r(b,c)$ do not
Ew-entail $r(a,c)$.
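The claims about this model can be checked mechanically. The sketch below assumes that constraints (1) and (2) are reflexivity and symmetry (they are stated earlier in the thesis, not in this passage); constraint (3) is transitivity, as the text says. The helper names are hypothetical.

```python
def is_reflexive(rel, domain):
    return all((x, x) in rel for x in domain)


def is_symmetric(rel):
    return all((y, x) in rel for (x, y) in rel)


def is_transitive(rel):
    return all((x, z) in rel
               for (x, y) in rel
               for (w, z) in rel
               if y == w)


# The model described in the text.
D = {1, 2, 3}
A = {"a": 1, "b": 2, "c": 3}
A_r = {(1, 1), (2, 2), (3, 3), (1, 2), (2, 1), (2, 3), (3, 2)}


def satisfies_r(x, y):
    """True iff the model satisfies the atomic sentence r(x, y)."""
    return (A[x], A[y]) in A_r


print(is_reflexive(A_r, D), is_symmetric(A_r), is_transitive(A_r))
print(satisfies_r("a", "b"), satisfies_r("b", "c"), satisfies_r("a", "c"))
```

The relation fails transitivity because it contains the pairs for (1,2) and (2,3) but not (1,3), which is exactly why the model satisfies r(a,b) and r(b,c) while falsifying r(a,c).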


Though  this  example  program  specification  is  quite  simple  it  has  illustrated  many  of  the 
major  points  about  the  method  of  specifying  programs  with  model  theory. 

1.6.  Requirements  on  Retrievability 

There  are  three  restrictions  that  I  place  on  any  retrievability  relation.  Each  of  these 
already  has  been  discussed  more  or  less  explicitly,  but  here  they  are  made  explicit  and  precise 
by  presenting  them  in  terms  of  the  restrictions  they  impose  on  a  retrievability  relation. 

If $kb$ is a set of sentences and $q$ a single sentence of some language whose Tarskian entailment
relation is $\models_T$, then any retrievability relation $\rightarrow_R$ for that language must satisfy:
soundness: $kb \rightarrow_R q$ only if $kb \models_T q$.
verbatim retrieval: if $q \in kb$, then $kb \rightarrow_R q$.
decidability: $\rightarrow_R$ is a decidable relation.

The first requirement demands that no sentence can be retrieved from a KB unless it is
entailed by the KB, while the second demands that sentences explicitly in the KB are retrievable.
Both of these requirements are met by any retrievability relation that is the entailment
relation of a model theory obtained by relaxing T. As previously argued, relaxing constraints in
a model theory can only weaken its entailment relation and therefore the soundness requirement
is met. The verbatim retrieval requirement quite clearly is met by any entailment relation.

It  is  a  virtue  of  the  model  theoretic  specification  technique  that  it  imposes  constraints  on 
the  relations  that  can  be  specified,  some  of  which  are  demanded  independently  by  the  nature  of 
the  relations  that  we  are  attempting  to  specify.  Consequently,  relations  that  fail  to  satisfy  the 
soundness  and  verbatim  retrieval  requirements  cannot  even  arise  in  this  study.  This  contrasts 
with earlier work (Frisch and Allen, 1982) employing a more syntactic form of specification in
which  it  was  necessary  to  prove  that  each  specified  retriever  met  these  requirements. 

Given  that  the  retrievability  relations  of  this  thesis  are  produced  by  the  advocated 
model-theoretic  method,  only  the  decidability  requirement  need  concern  us.  This  requirement, 
which  ensures  that  the  retriever  can  be  realized  by  an  effective  procedure  that  is  guaranteed  to 
terminate, is weaker than it ideally should be: that the retriever could be realized by a procedure
requiring only some small amount of computational resources. As previously discussed,
the  first  three  retrievers  attempt  to  achieve  this  required  efficiency  through  the  elimination  of 
chaining.  Achieving  decidability  in  the  fourth  retriever  requires  that  all  necessary  taxonomic 
reasoning  can  be  performed  with  only  a  bounded  amount  of  chaining. 

The three requirements discussed in this section can be captured poetically, as well as formally:


Insist  what’s  explicit  is  found 
Within  computational  bound. 

And  what’s  retrievable 
Must  be  believable, 

So  make  sure  that  inference  is  sound. 

1.7.  Other  Methods  for  Limiting  Inference 

The computing literature contains a number of approaches to limiting inference. This section
reviews several of these and argues that none of them is appropriate for limiting inference
in a retriever. This should not be surprising, for each of the approaches was developed for purposes
other than retrieval.

One way of limiting inference is to restrict the expressiveness of the representation
language used to express the knowledge in the KB. Such restrictions can simplify the decision
of whether a given fact is entailed by the facts in the KB. This is the approach taken to yield
efficient retrieval in data bases. For example, even in the area of logic data bases where the
emphasis is on more expressive languages, data bases often are limited to the Horn-clause subset
of First-Order Predicate Calculus.4 This restriction precludes expressing "Either John or his
brother has the money" without expressing which one has the money. This approach to simplifying
retrieval can also take the form of making assumptions about the domain and its relationship
to the representation language. Common assumptions are that the domain is finite, that
each individual symbol in the language denotes a distinct individual (E-saturation or unique
name assumption; Reiter (1980; 1984)), that each individual is denoted by some individual symbol
(Domain closure; Reiter (1980; 1984)), and that everything that cannot be proved true is
false (Closed World Assumption; Reiter (1978; 1984)). The general acceptance of this approach
to achieving efficient retrieval distinguishes the field of data bases from the field of AI
knowledge representation.

Assumptions  of  the  sort  described  above  should  not  be  built  into  an  AI  knowledge 
representation  language.  To  do  so  would  restrict  the  set  of  situations  in  which  a  system  could 
perform intelligently. These assumptions are just not true of our everyday world and its
relationship to any system possessing common sense.

A  representation  should  define  a  set  of  valid  inferences  that  could  be  made,  not  those  that 
are  made.  Even  if  the  retriever  only  makes  a  small  portion  of  all  valid  inferences  the  remaining 
possibilities  must  be  available  for  the  reasoner  to  consider. 

4  Gallaire,  Minker,  and  Nicolas  (1978)  overview  the  field  of  logic  data  bases  in  general,  and  define  Horn  clauses 
in  particular.  Kowalski  (1979)  treats  Horn  clauses  in  depth. 
Another common approach to limiting inference is to restrict the amount of resources used
in the computation (Norman and Bobrow, 1975; Bobrow and Winograd, 1977; Robinson and
Sibert, 1981). This can be done by restricting computation time, the total number of inference
steps taken, or the depth of the inference. These approaches are unsuitable for knowledge
retrieval because they limit all forms of inference uniformly. For example, if inference is limited
to a depth of 5, then properties cannot be inherited down 6 levels of a type hierarchy. In
general, there may be some kinds of inference that we want to be computed completely and others
that we want to be ignored completely. A methodology for limiting inference for retrieval
should provide the designer with enough control to pick and choose the inferences that he wants
to be performed.

A  further  class  of  limited  inference  systems  consists  of  the  incomplete  theorem  provers 
that  are  fairly  common  in  the  literature:  for  example,  Brown’s  (1978)  system.  Typically,  these 
systems are not guaranteed to terminate, and often fail to meet the verbatim retrieval criterion.

A  method  of  limiting  inference  that  has  been  suggested  to  me  is  to  find  a  traditional  proof 
system  in  the  mathematical  literature  consisting  of  a  set  of  axioms  and  inference  rules  and  then 
eliminate  some  of  these  inference  rules  and/or  axioms.  But  there  is  no  reason  to  believe  that 
the  distinctions  between  the  kinds  of  inference  drawn  by  these  inference  rules  and  axioms  are 
the  distinctions  that  I  want  to  draw. 


1.8.  A  Note  on  Notation 

I have attempted to use standard notation and terminology as far as possible. Most of the
notation and terminology is defined when it is first used. However, it is worth introducing some
at the very outset:

• Recalling that $\alpha \models \beta$ means that no model both satisfies $\alpha$ and falsifies $\beta$, a model that
satisfies $\alpha$ and falsifies $\beta$ is called a countermodel to $\alpha \models \beta$.

• A literal is an atomic formula or its negation. The former are called positive literals and the
latter, negative literals. If $\alpha$ is an expression or a set of expressions, LITERALS($\alpha$) is the
set of all literals occurring in $\alpha$.

• If $\phi$ is a formula then its complement, written $\phi^c$, is defined by:

$$\phi^c = \begin{cases} \psi & \text{if } \phi = \lnot\psi \\ \lnot\phi & \text{otherwise} \end{cases}$$

•  A  model  theory  is  said  to  be  decidable  or  undecidable  according  to  the  decidability  of  its 
entailment  relation. 

•  As  previously  stated,  every  model  theory  discussed  in  this  thesis  is  given  a  name.  T  is  the 
Tarskian  model  theory.  It  names  the  Tarskian  model  theory  for  whatever  language  happens 
to  be  under  discussion. 
• The least upper bound of a set $X$ is written as $\bigsqcup X$.

• In introducing a formula $\phi$, I may write $\phi[x_1,\ldots,x_n]$. Subsequently, I write $\phi[t_1,\ldots,t_n]$
to refer to that formula which results from replacing all occurrences of $x_i$ in $\phi$ by $t_i$ (for
$1 \le i \le n$).

• If $\phi$ is a formula containing free variables $x_1,\ldots,x_n$, then its universal closure, denoted by
$\forall\phi$, is $\forall x_1,\ldots,x_n\,\phi$, and its existential closure, denoted by $\exists\phi$, is $\exists x_1,\ldots,x_n\,\phi$.


Figure  1.1:  The  Organization  of  ARGOT 


Figure 1.2: Resolution Refutations of $\{P \lor Q, \lnot P, \lnot Q\}$


Chapter  2 


A  Retriever  for  a  Propositional  Language 

This chapter develops and studies a retriever for a propositional language whose
truth-functional semantics is given by the standard Tarskian (1935) model theory. The retrievability
relation developed is the entailment relation of RP, a model theory obtained by relaxing the
Tarskian model theory.

Even  within  the  confines  of  a  propositional  language  one  encounters  the  most  fundamental 
issues  in  the  study  of  retrieval:  How  can  a  representation  be  quantized?  How  can  a  Tarskian 
model  theory  be  relaxed  to  obtain  a  model  theory  that  does  not  allow  these  quanta  to  be 
chained?  After  chaining  is  eliminated,  what  sort  of  inferences  remain?  What  properties  does  a 
"logic  of  no-chaining"  have  and  how  do  they  compare  to  the  properties  of  the  Tarskian  logic 
from  which  it  was  derived? 

The  principal  results  of  this  chapter  include: 

•  a  workable  definition  of  a  "quantum"  of  representation, 

•  the  specification  of  RP,  a  propositional  logic  that  is  a  relaxation  of  T, 

•  a  proof  that  no  form  of  chaining  is  valid  in  RP, 

•  a  proof  that,  modulo  the  way  that  the  representation  is  quantized,  RP  is  the  strongest 
logic  that  allows  no  chaining,  and 

• a specification of a retrieval algorithm that meets this model-theoretic specification,
i.e., a decision procedure for RP.

2.1.  The  Quantifier-Free  Predicate  Calculus 

The retriever studied in this section operates on the sentences of a language called
"Quantifier-Free Predicate Calculus" (QFPC). That is, throughout this chapter a KB consists
of a finite set of QFPC sentences and a query specifies a QFPC sentence to be retrieved. Therefore,
the retrieval problems considered in this chapter are those that can be represented as a
finite QFPC sequent, a sequent whose antecedent is a finite set of QFPC sentences and whose
consequent is a single QFPC sentence.

Syntactically and semantically, QFPC is identical to First-Order Predicate Calculus
except that QFPC has no quantifiers. Briefly, the lexicon of QFPC consists of a set of variables,
a set of function symbols, and a set of predicate symbols. In the usual manner, variables and
function symbols can be combined to form terms, which can be combined with predicate symbols
to form atomic formulas, which in turn can be combined with the logical connectives to
form molecular formulas. As usual, a sentence is a formula with no free variables. Therefore,
since QFPC has no quantifiers, a QFPC sentence has no variables whatsoever.


It may appear curious that QFPC allows variables in its formulas even though the retriever
only deals with variable-free sentences. However, there is reason to this madness. Accommodating
variables into the analysis of this chapter adds little complexity but greatly simplifies
matters in Chapter 4 when the retriever is extended to handle quantifiers. For example, this
chapter presents a theorem concerning the application of a normal-form transformation to formulas.
Because this result concerns all formulas, not just sentences, it is applicable to the
more general circumstances that arise in Chapter 4.

This  is  a  good  place  to  reiterate  that  I  never  speak  of  the  value  of  an  open  formula  relative 
to  a  model;  it  only  has  a  value  relative  to  a  model  and  an  assignment  of  values  to  variables 
(henceforth  simply  called  a  value  assignment).  A  sentence,  however,  does  have  a  value  relative 
to a model, and can be satisfied or falsified by a model. Accordingly, only sentences participate
in entailment relations, including $\models_{RP}$, which specifies the retriever of this chapter.

2.2.  A  Matter  of  Fact 

Recall  my  original  suggestion  that  a  retriever  should  operate  by  dividing  the  KB  and 
query  into  quanta  called  "facts"  and  then  performing  retrieval  on  a  fact-by-fact  basis.  In  other 
words,  a  query  succeeds  if,  and  only  if,  each  of  its  facts  is  retrievable  from  a  single  fact  in  the 
KB.  However  this  in  itself  does  not  necessarily  lead  to  an  efficient  retriever  since  it  may  still  be 
difficult  to  decide  whether  a  single  fact  is  retrievable  from  a  single  fact.  Two  extreme 
approaches  could  be  used  to  obtain  an  easily-decidable  retrievability  relation  between  facts. 
One is to allow facts to be complex objects and to make retrievability between facts be a weak
relation in comparison to T-entailment. The other approach is to allow facts to be only simple
objects and to make retrievability between facts be a strong relation in comparison to
T-entailment. If the first route is taken, a version of the original problem remains: How should
T-entailment be weakened to yield an efficient retrievability relation over the set of facts?

I pursue the second route. Facts are defined to be so syntactically simple that retrievability
over the set of facts can be taken as T-entailment. A retriever designed in this way occupies
a privileged position; modulo the definition of fact, it will be the strongest retriever that
does no chaining.

These considerations impose three criteria on the way that facts and the RP model theory
are defined. Let us now make these criteria explicit. If $\phi_1$ and $\phi_2$ are facts and $\Phi$ is a set of
facts then:

No chaining: $\Phi \models_{RP} \phi_1$ iff for some $\phi \in \Phi$, $\phi \models_{RP} \phi_1$.

Strength: $\phi_1 \models_{RP} \phi_2$ iff $\phi_1 \models_T \phi_2$.

Efficiency: There is a simple algorithm that decides whether $\phi_1 \models_T \phi_2$.

Taken together, the Strength Criterion and the Efficiency Criterion demand that
T-entailment between facts must be easily decidable. Since deciding T-entailment between
arbitrary sentences of QFPC is intractable, facts must be expressively weaker than sentences in
general; that is, it must be that there are sentences that are not T-equivalent to any fact. The
two most obvious choices are to take a fact as a disjunction of variable-free literals or as a
conjunction of variable-free literals. I have chosen the former. I know of no reason why the other
choice is not viable and, though not pursued here, an investigation of the consequences of that
choice would be worthwhile. However, of the two choices, a disjunction of literals provides a
more natural basis for knowledge representation.

The  following  theorem  assures  us  that  with  the  above  definition  of  "fact"  it  is  possible  to 
satisfy  the  Strength  Criterion  without  violating  the  Efficiency  Criterion. 

T-Decision Theorem for Facts

For any facts, $\phi_1$ and $\phi_2$, $\phi_1 \models_T \phi_2$ iff either

(1) $\phi_2$ has complementary literals, or

(2) LITERALS($\phi_1$) $\subseteq$ LITERALS($\phi_2$).

Proof 

If clause: Assume Condition (2). From the definition of $\lor$ observe that a T-model satisfies a
disjunction iff it satisfies one of the disjuncts. Therefore, any T-model satisfying $\phi_1$ also
satisfies $\phi_2$, i.e., $\phi_1 \models_T \phi_2$. On the other hand, assume Condition (1). From the T-truth table
for $\lnot$ observe that each literal or its complement is satisfied by any T-model. Thus, (by the
argument above) any disjunction containing complementary literals is satisfied by every model,
and therefore $\phi_1 \models_T \phi_2$.

Only-if clause: Assuming neither Condition (1) nor (2) is satisfied, I construct $M$, a T-model
that satisfies $\phi_1$ but falsifies $\phi_2$. Let $M$ assign False to every literal in $\phi_2$. This is possible since
by assumption no literal occurs both positively and negatively in $\phi_2$. Therefore $M$ falsifies $\phi_2$.
An obvious consequence of the initial assumption is that $\phi_1$ contains some literal not contained
in $\phi_2$. Call it $L$. If $L^c$ is one of the disjuncts in $\phi_2$ then $M$ assigns False to $L^c$ and hence True
to $L$. Otherwise, we are free to let $M$ assign True to $L$. In either case, $M$ assigns True to $L$
and therefore, by the definition of $\lor$, satisfies $\phi_1$. ■

The two syntactic conditions for T-entailment can be easily computed. Even if facts are
encoded as unordered lists, this decision can be made in O(n log n) time, where n is the sum of
the lengths of the facts. This could be reduced further by a better encoding of facts. Proof
complexity is another important measure. A proof that one fact T-entails another would have
to demonstrate either that every member of a certain set of literals was contained in another or
that a certain set contained complementary literals. The size of each of the necessary
demonstrations is proportional to the size of the facts involved.
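The two syntactic conditions of the theorem translate directly into code. The following is a minimal sketch, not the thesis's own implementation: the encoding of a literal as an `(atom, is_positive)` pair and of a fact as a collection of such pairs is an assumption, as are the function names. It uses hash sets rather than sorting, so the expected running time is linear in the size of the facts.

```python
def complement(literal):
    """Return the complementary literal (same atom, opposite sign)."""
    atom, is_positive = literal
    return (atom, not is_positive)


def has_complementary_literals(fact):
    literals = set(fact)
    return any(complement(lit) in literals for lit in literals)


def t_entails(fact1, fact2):
    """Decide whether fact1 T-entails fact2, for facts that are
    disjunctions of variable-free literals.

    Condition (1): fact2 contains complementary literals.
    Condition (2): every literal of fact1 occurs in fact2.
    """
    return has_complementary_literals(fact2) or set(fact1) <= set(fact2)


# Hypothetical literals for illustration.
P, NOT_P, Q = ("P", True), ("P", False), ("Q", True)

print(t_entails([P], [P, Q]))      # condition (2) holds
print(t_entails([P, Q], [P]))      # neither condition holds
print(t_entails([Q], [P, NOT_P]))  # condition (1) holds
```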
2.3. Defining RP

With  the  definition  of  "fact"  in  hand  we  are  ready  to  relax  the  constraints  on  T  in  order  to 
produce  RP.  In  doing  so,  we  can  safely  ignore  the  Efficiency  Criterion  because  of  the  previous 
theorem,  and  thereby  concentrate  on  satisfying  the  No-Chaining  and  the  Strength  Criteria. 
However,  this  is  tricky  because  these  criteria  apply  opposing  forces;  the  No-Chaining  Criterion 
demands  a  logic  that  is  weak  in  a  certain  way,  while  the  Strength  Criterion  demands  a  logic 
that  is  strong  in  another  way. 

It  is  tempting  to  try  to  produce  the  specification  of  the  relaxed  model  theory  by  following 
the  tactic  used  in  Section  1.5.1  of  simple  textual  deletion  of  some  constraint  on  what  constitutes 
an  unrelaxed  model.  However,  in  the  case  of  T  it  is  not  so  straightforward.  E  specifies  three 
constraints  on  the  relations  that  can  be  assigned  to  the  symbol  r  and  thus  prevents  the  atomic 
sentences  of  the  language  from  obtaining  certain  combinations  of  truth  values.  (Recall  that  all 
sentences  in  the  object  language  of  E  are  atomic.)  Relaxing  E  to  obtain  Ew  involves  deleting 
one  of  the  three  constraints,  allowing  the  atomic  sentences  to  be  given  additional  combinations 
of  the  two  truth  values. 

Unlike  E,  T  places  no  constraints  on  what  a  model  can  assign  to  a  predicate  symbol  and 
therefore  the  atomic  formulas  of  the  language  can  be  assigned  any  combination  of  the  two  truth 
values.  So  the  strategy  of  generating  additional  valuations  by  giving  the  atomic  sentences  more 
combinations  of  the  two  truth  values  cannot  be  pursued  in  this  case.  This  leaves  two  options: 
either  allow  atomic  sentences  to  be  mapped  to  values  other  than  True  or  False  or  modify  the 
way values are assigned to molecular formulas. This thesis pursues the first strategy. Elsewhere
(Frisch, 1985a) I have specified a nearly identical retriever using the second strategy.

Examination of why $P$ and $\lnot P \lor Q$ T-entail $Q$ provides the insight used to derive the
new model theory, RP. Consider this three-step argument that $P, \lnot P \lor Q \models_T Q$:

(1) Assume that $P$ and $\lnot P \lor Q$ are both satisfied by a certain model.

(2) Since $P$ is satisfied, $\lnot P$ isn't.

(3) Consequently, if the model is to satisfy $\lnot P \lor Q$, as assumed, it must satisfy $Q$.

As far as chaining is concerned, step (2) is the crucial one; it connects $P$ and $\lnot P \lor Q$. The validity
of the step rests on the assumption that a model satisfies only one of $P$ and $\lnot P$, a
justified assumption in T where a model assigns each sentence either True or False, but never
both. RP relaxes the restriction that the assignments of True and False are exclusive by allowing
each sentence to be assigned a non-empty subset of {True, False}. Hence, RP has three
truth values: {True}, {False} and {True, False}. We will see that this modification admits
models that satisfy both $P$ and $\lnot P$, thus eliminating modus ponens as a sound rule of inference.

Let us make this precise. Like a T-model, an RP-model is a pair $\langle D,A\rangle$ where $D$ (the
domain) is a non-empty set and $A$ is an assignment of appropriate semantic objects to the
non-logical symbols of QFPC. As in a Tarskian model, $A$ assigns to every n-ary function
symbol a function from $D^n$ to $D$. However, unlike a Tarskian model, $A$ assigns to every n-ary
predicate symbol a function from $D^n$ to {{True}, {False}, {True, False}}. In comparing RP to
T the difference between True and {True} and between False and {False} always will be
ignored.

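The exclusivity of True and False is built implicitly into the usual semantic equations that
determine how T recursively assigns values to molecular formulas. Consider, for example, (2.1)
and (2.2), the semantic equations for disjunction and negation. Here, $[\![\phi]\!]^{M,e}$ is the truth value
of formula $\phi$ relative to model $M$ and value assignment $e$.1

$$[\![\alpha \lor \beta]\!]^{M,e} = \begin{cases} \text{True} & \text{if } [\![\alpha]\!]^{M,e} = \text{True or } [\![\beta]\!]^{M,e} = \text{True} \\ \text{False} & \text{otherwise} \end{cases} \tag{2.1}$$

$$[\![\lnot\alpha]\!]^{M,e} = \begin{cases} \text{True} & \text{if } [\![\alpha]\!]^{M,e} = \text{False} \\ \text{False} & \text{otherwise} \end{cases} \tag{2.2}$$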

Notice  that  in  these  equations  the  assignment  of  False  is  based  on  the  non-assignment  of 
True.  Let  us  now  assume  that  formulas  can  be  assigned  a  set  of  values — {True}  and  {False}  in 
the  Tarskian  case — and  define  the  assignment  of  True  and  False  independently  of  each  other. 
(2.1)  and  (2.2)  can  be  written  equivalently  as  (2.3)  and  (2.4). 


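$$\begin{aligned} \text{True} \in [\![\alpha \lor \beta]\!]^{M,e} &\text{ iff } \text{True} \in [\![\alpha]\!]^{M,e} \text{ or } \text{True} \in [\![\beta]\!]^{M,e} \\ \text{False} \in [\![\alpha \lor \beta]\!]^{M,e} &\text{ iff } \text{False} \in [\![\alpha]\!]^{M,e} \text{ and } \text{False} \in [\![\beta]\!]^{M,e} \end{aligned} \tag{2.3}$$

$$\begin{aligned} \text{True} \in [\![\lnot\alpha]\!]^{M,e} &\text{ iff } \text{False} \in [\![\alpha]\!]^{M,e} \\ \text{False} \in [\![\lnot\alpha]\!]^{M,e} &\text{ iff } \text{True} \in [\![\alpha]\!]^{M,e} \end{aligned} \tag{2.4}$$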

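The semantic equations for the other logical connectives can be rewritten in a similar
fashion, or, equivalently, they can be defined in terms of disjunction and negation. For example,
define

$$\land = \lambda x,y.\ \lnot(\lnot x \lor \lnot y)$$

or, if you prefer,

$$\begin{aligned} \text{True} \in [\![\alpha \land \beta]\!]^{M,e} &\text{ iff } \text{True} \in [\![\alpha]\!]^{M,e} \text{ and } \text{True} \in [\![\beta]\!]^{M,e} \\ \text{False} \in [\![\alpha \land \beta]\!]^{M,e} &\text{ iff } \text{False} \in [\![\alpha]\!]^{M,e} \text{ or } \text{False} \in [\![\beta]\!]^{M,e} \end{aligned} \tag{2.5}$$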

Figure 2.1 displays the truth tables for negation, disjunction and conjunction in this
three-valued logic. "T," "F," and "TF" abbreviate the names of the truth values in the obvious
way.

1 Accommodating value assignments at this point facilitates the incorporation of quantifiers in Chapter 4.
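The entries of these truth tables follow mechanically from equations (2.3), (2.4) and (2.5). As a minimal sketch, the three truth values can be encoded as Python frozensets of booleans; this encoding and the names `neg`, `disj`, `conj` and `table` are assumptions made for illustration.

```python
T = frozenset({True})
F = frozenset({False})
TF = frozenset({True, False})
VALUES = (T, F, TF)
NAMES = {T: "T", F: "F", TF: "TF"}


def neg(a):
    # (2.4): True is in the result iff False is in a, and vice versa.
    return frozenset(not v for v in a)


def disj(a, b):
    # (2.3): True and False are assigned independently of each other.
    result = set()
    if True in a or True in b:
        result.add(True)
    if False in a and False in b:
        result.add(False)
    return frozenset(result)


def conj(a, b):
    # Conjunction defined from disjunction and negation; agrees with (2.5).
    return neg(disj(neg(a), neg(b)))


def table(connective):
    """Rows of a binary truth table as (left, right, result) name triples."""
    return [(NAMES[a], NAMES[b], NAMES[connective(a, b)])
            for a in VALUES for b in VALUES]


for row in table(disj):
    print(*row)
```

Restricting the arguments to `T` and `F` yields only `T` and `F` as results, which is the usual two-valued table.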

It  should  now  be  clear  that  these  semantic  equations — (2.3),  (2.4),  and  (2.5) — can  be  used 
to  assign  values  to  formulas  in  RP-models  as  well  as  in  T-models.  It  should  also  be  clear  that 
T-models  are  precisely  those  RP-models  where  no  atomic  sentence  is  assigned  {True,  False}. 
Hence, as one would expect, each three-valued truth table contains the truth table for the
two-valued Tarskian logic.

We say that model $M$ satisfies sentence $\alpha$ if, and only if, $\text{True} \in [\![\alpha]\!]^{M,e}$; otherwise $M$
falsifies $\alpha$.2 Furthermore, $M$ satisfies a set of sentences iff it satisfies each sentence in the set
and $M$ falsifies a set of sentences iff it falsifies some sentence in the set. As usual, a set of sentences
$A$ RP-entails sentence $\beta$ iff there is no model that satisfies $A$ and falsifies $\beta$. A queried
sentence of QFPC is retrievable from a KB of QFPC-sentences if, and only if, the queried sentence
is RP-entailed by the sentences in the KB.

Let us now return to the example that motivated this definition of RP and ask, "Do $P$ and
$\lnot P \lor Q$ RP-entail $Q$?" The answer is "no", because there are RP-models that satisfy both $P$
and $\lnot P$. For example, consider model $M$, which assigns {True, False} to $P$ and {False} to $Q$.
According to the definitions of the connectives, $M$ assigns {True, False} to both $\lnot P$ and
$\lnot P \lor Q$. So, $M$ satisfies both $P$ and $\lnot P \lor Q$ but falsifies $Q$, and therefore is a countermodel to
the claim that $P, \lnot P \lor Q \models_{RP} Q$.
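This countermodel can also be found by brute force. The sketch below is an illustration under stated assumptions, not the decision procedure developed later in the thesis: it treats each atomic sentence as an opaque name, encodes formulas as nested tuples (`("not", f)` and `("or", f, g)`), and enumerates every assignment of truth values to the atoms. All function names are hypothetical.

```python
from itertools import product

T = frozenset({True})
F = frozenset({False})
TF = frozenset({True, False})


def value(formula, model):
    """Value of a formula in a model (a dict from atom names to truth values)."""
    if isinstance(formula, str):
        return model[formula]
    if formula[0] == "not":
        return frozenset(not v for v in value(formula[1], model))
    if formula[0] == "or":
        a, b = value(formula[1], model), value(formula[2], model)
        result = set()
        if True in a or True in b:
            result.add(True)
        if False in a and False in b:
            result.add(False)
        return frozenset(result)
    raise ValueError(f"unknown connective: {formula[0]!r}")


def atoms(formula):
    if isinstance(formula, str):
        return {formula}
    return set().union(*(atoms(sub) for sub in formula[1:]))


def countermodels(kb, query, values=(T, F, TF)):
    """Yield every model that satisfies all of kb and falsifies query."""
    names = sorted(set().union(*(atoms(f) for f in list(kb) + [query])))
    for combo in product(values, repeat=len(names)):
        model = dict(zip(names, combo))
        if all(True in value(f, model) for f in kb) \
                and True not in value(query, model):
            yield model


def entails(kb, query, values=(T, F, TF)):
    """RP-entailment by default; pass values=(T, F) for T-entailment."""
    return next(countermodels(kb, query, values), None) is None


kb = ["P", ("or", ("not", "P"), "Q")]
print(entails(kb, "Q"))                 # RP-entailment
print(entails(kb, "Q", values=(T, F)))  # T-entailment
print(list(countermodels(kb, "Q")))
```

The enumeration is exponential in the number of atoms, so it serves only to check small examples such as this one.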


2.4.  Properties  of  RP 

Now that RP is defined, this section examines some of its properties. The preeminent
results of this section show several logical equivalences for the RP logic and show that the
retrievability relation that RP defines satisfies the Strength and the No-Chaining Criteria.
These results lead directly to the decision procedure of the next section. However, before turning
to them, we first observe some of the algebraic properties of this model theory. These algebraic
properties prove very useful both in examining the properties of direct concern and in
comparing this logic to others.

First note that the three truth values under the partial order of set inclusion form the
upper semi-lattice shown in Figure 2.2. Because this semi-lattice forms three fourths of
Belnap's (1975; 1977) A4 lattice, I call it A3 and denote its partial order by $\sqsubseteq_{A3}$.
2 It doesn't matter what $e$ is since $\alpha$ is a sentence and hence contains no variables.

       {True, False}
          /     \
    {True}     {False}

Figure 2.2: The A3 Semi-Lattice

As usual, an n-ary logical connective can be viewed as a function from an n-tuple of truth
values to a truth value, in this case $C: A3^n \rightarrow A3$. $\sqsubseteq_{A3}$ can be extended to order $A3^n$ on a
point-by-point basis; i.e.,

$$\langle\alpha_1,\ldots,\alpha_n\rangle \sqsubseteq_{A3^n} \langle\beta_1,\ldots,\beta_n\rangle \text{ iff } \alpha_i \sqsubseteq_{A3} \beta_i \text{ for all } 1 \le i \le n$$

Now, examination of the truth tables for disjunction and negation reveals that they are monotonic
functions; that is, if $\alpha,\beta \in A3^n$ and $C$ is the function denoted by "$\lor$" or "$\lnot$" then
$\alpha \sqsubseteq_{A3^n} \beta$ implies $C(\alpha) \sqsubseteq_{A3} C(\beta)$. Considering equations (2.3) and (2.4), this observation is not
surprising. The presence of True or False in the assignment to a formula is determined solely
by the presence, never the absence, of True or False in the assignment to its subformulas.
Finally, all logical connectives are monotonic, since they can be defined as compositions of
disjunction and negation.

Further examination of the truth tables reveals that {True, False} is a fixed point for both
negation and disjunction, and therefore a fixed point for all the logical connectives.

On  its  own,  A3  is  not  very  interesting.  However,  it  can  be  used  to  construct  complete 
upper  semi-lattices  of  RP-models.  This  construction,  and  several  others  encountered  in  this 
thesis,  require  sets  of  models  that  are  "compatible"  in  the  following  sense: 

Definition:  Compatible  Set  of  Models 

A  set  of  models  is  compatible  if,  and  only  if,  every  model  in  the  set  has  the  same  domain  and 
each  function  symbol  is  assigned  the  same  function  in  every  model. 

Several  points  deserve  explicit  mention:  A  set  containing  one  model  is  compatible,  as  is  the 
set  of  all  Herbrand  models.  Compatible  models  interpret  every  term  identically,  though  they 
may  differ  in  their  interpretations  of  the  atomic  sentences. 

A maximal set of compatible RP-models forms a complete upper semi-lattice, called an
A3S semi-lattice, under the partial ordering $\sqsubseteq_{A3S}$ defined as:

$$M \sqsubseteq_{A3S} M' \text{ iff } [\![\alpha]\!]^{M,e} \sqsubseteq_{A3} [\![\alpha]\!]^{M',e} \tag{2.6}$$

for every atomic formula $\alpha$ and value assignment $e$.
Here is a simple example of an A3S semi-lattice. Consider the language whose lexicon has
only three symbols: a zero-place function symbol, $a$, and two one-place predicates, $P$ and $Q$.
Every Herbrand model $\langle D,A\rangle$ for this language is such that $D = \{a\}$ and $A(a) = a$. The Herbrand
models differ in their assignment of the truth values to $P(a)$ and $Q(a)$. Accordingly, a
Herbrand model $M$ can be described succinctly by the pair $\langle [\![P(a)]\!]^{M,e}, [\![Q(a)]\!]^{M,e} \rangle$. So, for
example, $\langle TF,F\rangle$ is the Herbrand countermodel to the claim that $P(a), \lnot P(a) \lor Q(a) \models_{RP} Q(a)$.

Now, Figure 2.3 shows the A3S semi-lattice of the Herbrand models for this language. As our
examination of the general properties of A3S semi-lattices continues, the reader may find it
useful to check all claims against this semi-lattice.

The  minimal  elements  of  an  A3S  semi-lattice  are  precisely  the  Tarskian  models  of  the 
semi-lattice. Each A3S semi-lattice has a greatest element, T, which assigns {True, False} to
every atomic formula, regardless of the value assignment. Furthermore, because {True, False}
is a fixed point of every logical connective, T assigns {True, False} to every formula. Therefore
every  sentence  is  RP-satisfiable.  Here  for  the  first  time,  a  logical  property  has  fallen  out  of  the 
algebraic  properties. 

With  A3  and  A3S  now  in  place,  we  are  in  a  position  to  examine  the  connections  between 
them.  By  the  end  of  this  section  these  connections  will  prove  to  be  our  most  valuable  tool  in 
studying the properties of RP. The connections between a semi-lattice of models and a
semi-lattice of truth values are made by sentences. Associated with every sentence is a function,
called the intension of the sentence, which maps each model to the value of the sentence in that
model. Thus, the intension of sentence $\alpha$, written $[\![\alpha]\!]$, is defined as

$[\![\alpha]\!] =_{def} \lambda M.\, [\![\alpha]\!]^{M}$ (2.7)

This notion can be extended to formulas by speaking of the intension of a formula relative to a
value assignment. Hence, if $\psi$ is a formula and e is a value assignment, the intension of $\psi$
relative to e is defined by

$[\![\psi]\!]^{e} =_{def} \lambda M.\, [\![\psi]\!]^{M,e}$ (2.8)

The application of an intension, say $[\![\alpha]\!]^{e}$, to a model M is written as "$[\![\alpha]\!]^{M,e}$", not "$[\![\alpha]\!]^{e}(M)$".
Intensions have some important properties; for example, the following simple one:

Monotonic  Intension  Lemma 

Let $\psi$ be any formula and e be any value assignment. Then $[\![\psi]\!]^{e}$ is monotonic; i.e.,

$M \sqsubseteq_{A3S} M'$ implies $[\![\psi]\!]^{M,e} \sqsubseteq_{A3} [\![\psi]\!]^{M',e}$

Proof

Clearly the intension of every atomic formula is monotonic; that is the defining characteristic
of $\sqsubseteq_{A3S}$ (see (2.6)). The logical connectives are also monotonic functions. Therefore the
intension of a molecular formula, being composed of the intensions of atomic formulas and
connectives, is also monotonic. ■

Consequently, if M satisfies a sentence, so does every M' such that $M \sqsubseteq_{A3S} M'$. Therefore, a
corollary to the above lemma is that every T-tautology is also an RP-tautology.

Every non-Tarskian model in an A3S semi-lattice is the least upper bound (l.u.b.) of a set
of Tarskian models. If S is an A3S semi-lattice and $Y \subseteq S$, then it is easy to see that the
l.u.b. of Y, written $\sqcup Y$, is that model in S which assigns to each atomic sentence the union of
the sentence's values in all the models of Y. Extending this to atomic formulas as well as atomic
sentences, this observation is that for any atomic formula $\alpha$ and value assignment e,

$[\![\alpha]\!]^{\sqcup Y,e} = \bigcup \{ [\![\alpha]\!]^{y,e} \mid y \in Y \}$ (2.9)

Any $[\![\alpha]\!]^{e}$, regardless of whether $\alpha$ is atomic, that satisfies this condition is said to be
completely additive.3 Complete additivity can be divided into two component properties: $[\![\alpha]\!]^{e}$ is
completely additive iff it is both

completely t-additive: $\mathrm{True} \in [\![\alpha]\!]^{\sqcup Y,e}$ iff $\mathrm{True} \in \bigcup \{ [\![\alpha]\!]^{y,e} \mid y \in Y \}$,

for every Y that is a subset of some A3S semi-lattice, and

completely f-additive: $\mathrm{False} \in [\![\alpha]\!]^{\sqcup Y,e}$ iff $\mathrm{False} \in \bigcup \{ [\![\alpha]\!]^{y,e} \mid y \in Y \}$,

for every Y that is a subset of some A3S semi-lattice.

The  following  lemma  characterizes  some  formulas  whose  intensions  are  completely  additive. 


Complete  Additivity  Lemma 

Let $\alpha$ and $\beta$ be formulas and e be a value assignment. Then:

(1) If $\alpha$ is atomic then $[\![\alpha]\!]^{e}$ is completely additive.

(2) If $[\![\alpha]\!]^{e}$ is completely t-additive then $[\![\neg\alpha]\!]^{e}$ is completely f-additive.

(3) If $[\![\alpha]\!]^{e}$ is completely f-additive then $[\![\neg\alpha]\!]^{e}$ is completely t-additive.

(4) If $[\![\alpha]\!]^{e}$ and $[\![\beta]\!]^{e}$ are completely t-additive then so is $[\![\alpha \lor \beta]\!]^{e}$.

(5) If $[\![\alpha]\!]^{e}$ and $[\![\beta]\!]^{e}$ are completely f-additive then so is $[\![\alpha \land \beta]\!]^{e}$.


Proof 

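(1) Obvious. Previously stated as (2.9).

(2) $\mathrm{False} \in [\![\neg\alpha]\!]^{\sqcup Y,e}$
    iff $\mathrm{True} \in [\![\alpha]\!]^{\sqcup Y,e}$ (def. of $\neg$)
    iff $\mathrm{True} \in \bigcup \{ [\![\alpha]\!]^{y,e} \mid y \in Y \}$ (assumption)
    iff for some $y \in Y$, $\mathrm{True} \in [\![\alpha]\!]^{y,e}$ (def. of A3)
    iff for some $y \in Y$, $\mathrm{False} \in [\![\neg\alpha]\!]^{y,e}$ (def. of $\neg$)
    iff $\mathrm{False} \in \bigcup \{ [\![\neg\alpha]\!]^{y,e} \mid y \in Y \}$ (def. of A3)

(3) Similar to (2).

(4) $\mathrm{True} \in [\![\alpha \lor \beta]\!]^{\sqcup Y,e}$
    iff $\mathrm{True} \in [\![\alpha]\!]^{\sqcup Y,e}$ or $\mathrm{True} \in [\![\beta]\!]^{\sqcup Y,e}$ (def. of $\lor$)
    iff $\mathrm{True} \in \bigcup \{ [\![\alpha]\!]^{y,e} \mid y \in Y \}$ or $\mathrm{True} \in \bigcup \{ [\![\beta]\!]^{y,e} \mid y \in Y \}$ (assumption)
    iff for some $y \in Y$, $\mathrm{True} \in [\![\alpha]\!]^{y,e}$ or for some $y \in Y$, $\mathrm{True} \in [\![\beta]\!]^{y,e}$ (def. of A3)
    iff for some $y \in Y$, $\mathrm{True} \in [\![\alpha]\!]^{y,e}$ or $\mathrm{True} \in [\![\beta]\!]^{y,e}$ (metalogical)
    iff for some $y \in Y$, $\mathrm{True} \in [\![\alpha \lor \beta]\!]^{y,e}$ (def. of $\lor$)
    iff $\mathrm{True} \in \bigcup \{ [\![\alpha \lor \beta]\!]^{y,e} \mid y \in Y \}$ (def. of A3)

(5) Similar to (4). ■

3 This term is used by some authors (e.g., Sanderson (1973)) while others use "meet homomorphic" or "upper homomorphic".
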


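Not every intension is completely additive. For example, one can observe that the
intension of $P \land Q$ is not completely t-additive by considering the simple case in which:

$Y = \{M_1, M_2\}$

$[\![P]\!]^{M_1,e} = \{\mathrm{True}\}$    $[\![Q]\!]^{M_1,e} = \{\mathrm{False}\}$

$[\![P]\!]^{M_2,e} = \{\mathrm{False}\}$    $[\![Q]\!]^{M_2,e} = \{\mathrm{True}\}$

Consequently,

$[\![P]\!]^{\sqcup Y,e} = \{\mathrm{True}, \mathrm{False}\}$    $[\![Q]\!]^{\sqcup Y,e} = \{\mathrm{True}, \mathrm{False}\}$

$[\![P \land Q]\!]^{\sqcup Y,e} = \{\mathrm{True}, \mathrm{False}\}$

$\bigcup \{ [\![P \land Q]\!]^{y,e} \mid y \in Y \} = \{\mathrm{False}\}$

So, $[\![P \land Q]\!]^{\sqcup Y,e}$ contains True but $\bigcup \{ [\![P \land Q]\!]^{y,e} \mid y \in Y \}$ does not, thereby demonstrating
that the intension of $P \land Q$ is not completely t-additive.4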

Fact  Intension  Theorem 

Every  fact  has  a  completely  t-additive  intension. 

Proof 

The  Theorem  follows  immediately  from  the  Complete  Additivity  Lemma.  By  statement  (1), 
the  intension  of  every  atomic  formula  is  completely  additive,  and  thus,  by  (2)  and  (3),  so  is 
the intension of every literal. Therefore, by (4), a disjunction of literals must have a
completely t-additive intension. ■
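
The counterexample and the theorem above can be checked mechanically for the two-model set Y. The following is a minimal sketch, not part of the original text: it assumes propositional models are represented as dictionaries from atom names to sets of truth values, that the l.u.b. is the pointwise union described above, and that the connectives follow the truth conditions used in the proofs (the condition for False in a disjunction is an assumption). The names `lub` and `t_additive_on` are hypothetical.

```python
T, F = "True", "False"

def neg(a):
    # True is in the negation iff False is in a, and vice versa.
    return frozenset(({T} if F in a else set()) | ({F} if T in a else set()))

def disj(a, b):
    # True if either disjunct contains True; False only if both contain False.
    out = set()
    if T in a or T in b:
        out.add(T)
    if F in a and F in b:
        out.add(F)
    return frozenset(out)

def conj(a, b):
    # Conjunction defined from disjunction and negation.
    return neg(disj(neg(a), neg(b)))

def lub(models):
    """Least upper bound of compatible models: pointwise union of values."""
    return {p: frozenset().union(*(m[p] for m in models)) for p in models[0]}

def t_additive_on(formula, models):
    """Check the t-additivity condition for one particular set Y of models."""
    at_lub = T in formula(lub(models))
    in_union = any(T in formula(m) for m in models)
    return at_lub == in_union

M1 = {"P": frozenset({T}), "Q": frozenset({F})}
M2 = {"P": frozenset({F}), "Q": frozenset({T})}

print(t_additive_on(lambda m: conj(m["P"], m["Q"]), [M1, M2]))  # conjunction
print(t_additive_on(lambda m: disj(m["P"], m["Q"]), [M1, M2]))  # a fact
```

Under these assumptions the conjunction fails the condition on Y while the disjunction of literals passes it, in line with the Fact Intension Theorem.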

We  now  turn  directly  to  the  logical  properties  of  RP.  As  in  two-valued  logic,  it  is  often 
convenient  to  restrict  our  attention  to  the  Herbrand  models.  This  selective  attention  is 
licensed  by  the  Herbrand-Model  Lemma,  which  says  that  for  certain  purposes  the  Herbrand 
models are representative of all models. This lemma holds in RP for precisely the same
reasons that it holds in a two-valued logic, so it is stated here without proof.5

4 Notice that Y is not a directed set, so this example does not demonstrate that the intension of $P \land Q$ is not
continuous.

Herbrand-Model Lemma

Let A and B each be a set of sentences. If no Herbrand RP-model satisfies A and falsifies B
then $A \models_{RP} B$.6

Since  the  set  of  all  Herbrand  models  is  compatible,  it  forms  an  A3S  semi-lattice  that, 
for  certain  purposes,  is  representative  of  all  A3S  semi-lattices.  Thus,  for  example,  one  can 
examine the semi-lattice of Figure 2.3 in order to verify that $\models_{RP} P(a) \lor \neg P(a)$ and that
$P(a) \models_{RP} P(a) \lor Q(a)$.

It  is  now  easy  to  prove  that  RP  meets  the  No-Chaining  and  Strength  Criteria. 

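No-Chaining Theorem

Let $\Phi$ be a set of sentences and $\alpha$ be a fact. Then $\Phi \models_{RP} \alpha$ iff for some $\phi \in \Phi$, $\phi \models_{RP} \alpha$.

Proof

If clause: Obvious.

Only-if clause: Assuming that there is no $\phi \in \Phi$ such that $\phi \models_{RP} \alpha$, I construct a model $M^*$
that satisfies every $\phi \in \Phi$ but falsifies $\alpha$. From the assumption it follows that for every $\phi \in \Phi$
there is a Herbrand RP-model, $M_\phi$, that satisfies $\phi$ and falsifies $\alpha$. Let $M^* = \sqcup \{ M_\phi \mid \phi \in \Phi \}$.
Now, for every $\phi \in \Phi$, $M_\phi \sqsubseteq_{A3S} M^*$ and therefore, by the Monotonic Intension Lemma, $M^*$
satisfies $\phi$. Moreover, $\mathrm{True} \notin \bigcup \{ [\![\alpha]\!]^{M_\phi} \mid \phi \in \Phi \}$ because each $M_\phi$ falsifies $\alpha$; therefore, since $\alpha$
is a fact, the Fact Intension Theorem assures us that $\mathrm{True} \notin [\![\alpha]\!]^{M^*}$. ■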

From  here,  the  presentation  could  proceed  in  two  ways,  each  building  on  the  T-Decision 
Theorem  for  Facts.  One  approach  would  proceed  by  first  proving  the  RP-Decision  Theorem 
for  Facts,  which  says  that  RP-entailment  between  facts  can  be  decided  in  the  same  way  that 
the  T-Decision  Theorem  for  Facts  says  to  decide  T-entailment.  The  Strength  Theorem 
would  then  follow  immediately.  The  other  approach  would  begin  by  proving  the  Strength 
Theorem, which then yields the RP-Decision Theorem for Facts as an immediate consequence.
However, since both the RP-Decision Theorem for Facts and the Strength Theorem
are  interesting  in  their  own  right,  and  since  both  have  simple  proofs,  I  take  a  third  approach 
by  proving  each  theorem  independently. 

Strength  Theorem 

If $\phi$ is a fact and $\psi$ is a sentence then $\phi \models_{RP} \psi$ iff $\phi \models_{T} \psi$.

Proof

Only-if clause: Immediate since every T-model is an RP-model.

If clause: Assuming $\phi \models_{T} \psi$ and that M is an RP-model satisfying $\phi$, I show that M satisfies
$\psi$. Since M is the l.u.b. of a set of T-models, the Fact Intension Theorem implies that there
must be some T-model $M' \sqsubseteq_{A3S} M$ that satisfies $\phi$. Furthermore, since $\phi \models_{T} \psi$, M' must also
satisfy $\psi$. Therefore, by the Monotonic Intension Lemma, M must satisfy $\psi$. ■

5 Robinson (1979) gives an excellent exposition of Herbrand models.
6 Because this is the propositional case, Skolem Normal Form is not an issue.

RP-Decision Theorem for Facts

For any facts, $\phi_1$ and $\phi_2$, $\phi_1 \models_{RP} \phi_2$ iff either

(1) $\phi_2$ has complementary literals, or

(2) $LITERALS(\phi_1) \subseteq LITERALS(\phi_2)$.


Proof 

If  clause:  Assume  Condition  (2).  From  the  definition  of  V  observe  that  an  RP-model 
satisfies a disjunction iff it satisfies one of the disjuncts. Therefore, any RP-model satisfying
$\phi_1$ also satisfies $\phi_2$; i.e., $\phi_1 \models_{RP} \phi_2$. On the other hand, assume Condition (1). From the
RP-truth table for $\neg$ observe that each literal or its complement is satisfied by any RP-model.
Thus (by the argument above) any disjunction containing complementary literals is satisfied
by every model, and therefore $\phi_1 \models_{RP} \phi_2$.

Only-if  clause:  Immediate  since  every  T-model  is  an  RP-model.  ■ 


Notice  that  this  theorem  and  the  proof  of  its  if-clause  are  identical  to  the  T-Decision 
Theorem  for  Facts  and  the  proof  of  its  if-clause  (except  of  course  that  one  concerns  T  and 
the  other  RP).  Moreover,  the  only-if-clause  of  this  theorem  could  be  proved  by  mimicking 
the  proof  of  the  only-if-clause  of  the  T-Decision  Theorem  for  Facts. 

This  section  concludes  by  presenting  some  RP-equivalences.  However,  we  first  examine 
the  notion  of  equivalence  itself,  which  must  be  handled  with  care  when  working  with  a 
multi-valued  logic. 

In general, two notions can be distinguished: mutual entailment ($\approx$), defined in (2.10),
and equivalence ($\equiv$), defined in (2.11).

$\alpha \approx \beta$ iff $\alpha \models \beta$ and $\beta \models \alpha$ (2.10)

$\alpha \equiv \beta$ iff $[\![\alpha]\!]^{M,e} = [\![\beta]\!]^{M,e}$ for all M and e. (2.11)

The relationship between these two definitions becomes clear if they are rewritten as

$\alpha \approx \beta$ iff for all M and e, $\mathrm{True} \in [\![\alpha]\!]^{M,e}$ iff $\mathrm{True} \in [\![\beta]\!]^{M,e}$ (2.10')

$\alpha \equiv \beta$ iff for all M and e, $\mathrm{True} \in [\![\alpha]\!]^{M,e}$ iff $\mathrm{True} \in [\![\beta]\!]^{M,e}$, and
$\mathrm{False} \in [\![\alpha]\!]^{M,e}$ iff $\mathrm{False} \in [\![\beta]\!]^{M,e}$ (2.11')

Notice that (2.10') defines mutual entailment over all formulas whereas (2.10) only defines it over the
sentences.

Trivially  in  T,  and  non-trivially  in  some  multi-valued  logics  (Belnap,  1975;  1977),  these 
two  definitions  coincide.  However,  in  RP  they  do  not.  This  is  demonstrated  by  the  sentences 
$P \lor \neg P$ and $Q \lor \neg Q$, which (mutually) RP-entail each other, but are not RP-equivalent.
Since each of these two sentences is an RP-tautology, every model assigns each of them a
value containing True. However, in a model that assigns {True, False} to P and {True} to
Q, $P \lor \neg P$ is assigned {True, False} while $Q \lor \neg Q$ is assigned {True}.

The choice of whether to use mutual entailment or equivalence depends on one's purposes.
This thesis is often concerned with substituting equals for equals and is therefore
concerned with the notion of equivalence.


RP-Equivalence  Theorem 

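If $\alpha$, $\beta$, and $\psi$ are formulas of QFPC then

(1) $\alpha \lor \alpha \equiv_{RP} \alpha$

(2) $\alpha \land \alpha \equiv_{RP} \alpha$

(3) $\neg\neg\alpha \equiv_{RP} \alpha$

(4) $\alpha \land (\beta \lor \psi) \equiv_{RP} (\alpha \land \beta) \lor (\alpha \land \psi)$

(5) $\alpha \lor (\beta \land \psi) \equiv_{RP} (\alpha \lor \beta) \land (\alpha \lor \psi)$

(6) $\alpha \lor \beta \equiv_{RP} \neg(\neg\alpha \land \neg\beta)$

(7) $\alpha \land \beta \equiv_{RP} \neg(\neg\alpha \lor \neg\beta)$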

Proof 

Construct  the  truth  tables.  ■ 
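
The truth-table construction can be carried out by brute force over the three RP truth values. The sketch below is illustrative only: it assumes the truth conditions used in the proofs above (True is in a disjunction iff it is in some disjunct, False iff it is in both; negation swaps True and False), and the helper names `equivalent` and `mutually_entail` are hypothetical.

```python
from itertools import product

T, F = "True", "False"
# The three RP truth values: the non-empty sets of classical truth values.
VALUES = [frozenset({T}), frozenset({F}), frozenset({T, F})]

def neg(a):
    return frozenset(({T} if F in a else set()) | ({F} if T in a else set()))

def disj(a, b):
    out = set()
    if T in a or T in b:
        out.add(T)
    if F in a and F in b:
        out.add(F)
    return frozenset(out)

def conj(a, b):
    # Conjunction defined from disjunction and negation.
    return neg(disj(neg(a), neg(b)))

def equivalent(f, g, arity):
    """Equivalence: identical values under every assignment."""
    return all(f(*vs) == g(*vs) for vs in product(VALUES, repeat=arity))

def mutually_entail(f, g, arity):
    """Mutual entailment: True is in one value exactly when it is in the other."""
    return all((T in f(*vs)) == (T in g(*vs))
               for vs in product(VALUES, repeat=arity))

# Statements (4) and (5) of the theorem: the two distributive laws.
dist4 = equivalent(lambda a, b, c: conj(a, disj(b, c)),
                   lambda a, b, c: disj(conj(a, b), conj(a, c)), 3)
dist5 = equivalent(lambda a, b, c: disj(a, conj(b, c)),
                   lambda a, b, c: conj(disj(a, b), disj(a, c)), 3)

# P or not-P versus Q or not-Q: mutual entailment without equivalence.
lem_p = lambda p, q: disj(p, neg(p))
lem_q = lambda p, q: disj(q, neg(q))
print(dist4, dist5, mutually_entail(lem_p, lem_q, 2), equivalent(lem_p, lem_q, 2))
```

The same `equivalent` check can be applied to the remaining statements of the theorem by writing each side as a function of its arguments.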


2.5.  The  Retrieval  Algorithm 

In  certain  cases  it  is  simple  to  see  how  retrievability  can  be  decided.  The  Fact  Decision 
Theorem  for  RP  provides  a  method  for  deciding  whether  or  not  a  query  consisting  of  a  single 
fact is retrievable from a KB containing a single fact. The No-Chaining Theorem extends
the method to the case where the KB contains any finite number of facts. Finally, the semantics
of conjunction, which says that an RP-model satisfies a conjunction iff it satisfies each
conjunct, provides for the case where the KB contains conjunctions of facts and the query is a
conjunction of facts. Hence, retrieval problems based on conjunctions of facts play a special
role in the analysis.

Formulas  that  are  conjunctions  of  disjunctions  of  literals  are  traditionally  said  to  be  in 
conjunctive  normal  form  (CNF).  The  use  of  CNF  in  the  decision  procedure  divides  the 
remainder  of  this  section  naturally  into  two  subsections.  The  first  is  concerned  with 
transforming  arbitrary  QFPC  formulas  into  CNF  while  the  second  is  concerned  with  the 
details of deciding if a CNF query is retrievable from a KB of CNF sentences.

The  sections  and  chapters  that  follow  use  some  terminology,  which  is  now  introduced. 
Since  every  retrieval  problem  is  characterized  by  the  set  of  sentences  in  the  KB  and  the 
queried  sentence,  I  introduce  into  the  meta-language  objects,  called  sequents,  that  encode 
such characterizations. A sequent is, quite simply, a pair whose first element is a set of
sentences and whose second is a single sentence. According to this definition a sequent, like a
pair of numerals, is merely a syntactic object; it has no assertional import; it is neither true
nor false. A sequent composed of kb and q is written as "$kb \Rightarrow q$"; kb is called the antecedent
of the sequent, and q is called the succedent of the sequent. In writing a sequent such as
$\{P(a), Q(a)\} \Rightarrow R(a)$ I usually omit the set signs and simply write $P(a), Q(a) \Rightarrow R(a)$.

A sequent is said to be finite or infinite according to the number of sentences that it
contains. Both finite and infinite sequents are used frequently throughout the remainder of this
thesis though, of course, retrieval problems always correspond to finite sequents. This
chapter is concerned only with sequents of QFPC, i.e., sequents containing only sentences of
QFPC. When we examine retrievers operating on other languages, sequents of those
languages will be used.

Finally, I stretch the terminology slightly and say that a sequent $kb \Rightarrow q$ is in an
entailment relation, by which I mean that kb entails q. Similarly, I say that a sequent is in a
retrievability or a provability relation.

2.5.1.  The  Conjunctive  Normal  Transformation 

The  conjunctive  normal  transformation  (CNT)  is  a  well-known  algorithm  for  converting 
any  QFPC  formula  to  a  T-equivalent  CNF  QFPC  formula.  After  this  transformation  is 
presented,  a  theorem  will  state  that  the  transformation’s  output  is  also  RP-equivalent  to  its 
input. 


Definition:  Conjunctive  Normal  Transformation 

Input:  An  arbitrary  formula  of  QFPC. 

Output:  A  CNF  formula  of  QFPC. 

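(1) Eliminate all occurrences of $\rightarrow$ and $\leftrightarrow$ by the following rules:

    Rewrite $\psi \rightarrow \phi$ to $\neg\psi \lor \phi$
    Rewrite $\psi \leftrightarrow \phi$ to $(\neg\psi \land \neg\phi) \lor (\psi \land \phi)$

(2) By the following rules push all occurrences of $\neg$ inward so that only atomic sentences
are negated:

    Rewrite $\neg\neg\psi$ to $\psi$
    Rewrite $\neg(\psi_1 \lor \cdots \lor \psi_n)$ to $\neg\psi_1 \land \cdots \land \neg\psi_n$
    Rewrite $\neg(\psi_1 \land \cdots \land \psi_n)$ to $\neg\psi_1 \lor \cdots \lor \neg\psi_n$

(3) With the following rule, distribute $\lor$ over $\land$ so that no $\land$ occurs within the scope of a
$\lor$:

    Rewrite $\psi \lor (\phi_1 \land \cdots \land \phi_n)$ to $(\psi \lor \phi_1) \land \cdots \land (\psi \lor \phi_n)$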

Conjunctive-Normal- Transformation  Theorem 

Every  QFPC  formula  is  RP-equivalent  to  its  conjunctive  normal  transform. 

Proof 

Each  of  the  rules  rewrites  a  formula  to  an  RP-equivalent  formula;  hence  the  transformation 
as  a  whole  rewrites  sentences  to  RP-equivalent  sentences.  The  RP-equivalences  that  justify 
the rewrite rules of step (1) follow from the definitions of $\rightarrow$ and $\leftrightarrow$, while the others are
from the RP-Equivalence Theorem. ■

A  sequent  is  in  CNF  iff  all  of  its  sentences  are.  The  conjunctive  normal  transformation 
can  be  used  to  put  arbitrary  sequents  into  CNF  merely  by  replacing  every  formula  in  the 
sequent  with  its  transform.  Because  this  operation  replaces  equals  with  equals,  the  resulting 
sequent  is  in  the  RP-entailment  relation  iff  the  original  one  is.  More  concisely,  I  say  that  the 
CNT  preserves  RP-entailment. 
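
The three steps of the transformation can be sketched in code. This is an illustrative sketch rather than the thesis's own implementation: it assumes formulas are nested tuples such as `("or", "P", ("not", "Q"))` with atoms as strings, it restricts the connectives to the binary case, and it applies the distribution rule on either side of a disjunction, which additionally assumes that disjunction is commutative. The function names are hypothetical.

```python
def eliminate(f):
    """Step (1): rewrite 'imp' and 'iff' in terms of not, or, and."""
    if isinstance(f, str):
        return f
    op, *args = f
    args = [eliminate(a) for a in args]
    if op == "imp":
        return ("or", ("not", args[0]), args[1])
    if op == "iff":
        a, b = args
        return ("or", ("and", ("not", a), ("not", b)), ("and", a, b))
    return (op, *args)

def push_not(f):
    """Step (2): push negations inward until only atoms are negated."""
    if isinstance(f, str):
        return f
    op, *args = f
    if op != "not":
        return (op, *(push_not(a) for a in args))
    g = args[0]
    if isinstance(g, str):
        return f  # a negated atom is already a literal
    gop, *gargs = g
    if gop == "not":
        return push_not(gargs[0])  # double negation
    dual = "and" if gop == "or" else "or"
    return (dual, *(push_not(("not", a)) for a in gargs))

def distribute(f):
    """Step (3): distribute 'or' over 'and'."""
    if isinstance(f, str) or f[0] == "not":
        return f
    op, a, b = f
    a, b = distribute(a), distribute(b)
    if op == "and":
        return ("and", a, b)
    if not isinstance(a, str) and a[0] == "and":
        return ("and", distribute(("or", a[1], b)), distribute(("or", a[2], b)))
    if not isinstance(b, str) and b[0] == "and":
        return ("and", distribute(("or", a, b[1])), distribute(("or", a, b[2])))
    return ("or", a, b)

def cnt(f):
    """Conjunctive normal transform: apply steps (1), (2), (3) in order."""
    return distribute(push_not(eliminate(f)))

print(cnt(("imp", "P", ("and", "Q", "R"))))
```

For the input shown, step (1) produces a disjunction of a negated atom and a conjunction, and step (3) then distributes the disjunction over the two conjuncts.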

2.5.2.  Retrievability  of  CNF  Sequents 

As stated at the outset of Section 2.5, the RP-Decision Theorem for Facts, the
No-Chaining Theorem and the semantics of conjunction make clear how to decide retrievability for CNF
sequents of QFPC. Nevertheless, this section presents a decision algorithm, not because it is
of intrinsic interest, but because it lays the foundation for the more interesting algorithms of
the chapters that follow. The algorithm is called the "Ground Sequent Retrieval Algorithm"
or simply the "GSRA"; the word "ground" is used to distinguish this algorithm, which
operates on the sentences of QFPC, from the algorithm of the next section, which operates on
sentence schemas.


Ground  Sequent  Retrieval  Algorithm 

Input: $kb \Rightarrow q$, a CNF sequent of QFPC.

Output: SUCCESS or FAILURE

(1)  let s = number of conjuncts in q
(2)  let $q_i$ = ith conjunct of q ($1 \le i \le s$)
(3)  let K be the set containing every conjunct of every sentence in kb
(4)  for i = 1 to s do
(5)      choose to do either step A) or step B)
(6)      A) choose $p_i$, a positive literal in $q_i$
(7)         choose $n_i$, the complement of a negative literal in $q_i$
(8)         if $n_i = p_i$
(9)         then continue
(10)        else FAIL
(11)     B) choose $b_i \in K$
(12)        let $l_{i,1}, \ldots, l_{i,m}$ be the literals of $b_i$
(13)        choose $F_i$, a total function from $LITERALS(b_i)$ to $LITERALS(q_i)$
(14)        if $(l_{i,1}, \ldots, l_{i,m}) = (F_i(l_{i,1}), \ldots, F_i(l_{i,m}))$
(15)        then continue
(16)        else FAIL
(17) SUCCEED

The  "choose"  statements  in  this  algorithm  make  non-deterministic  choices,  which  means 
that  the  algorithm  succeeds  iff  there  is  some  sequence  of  choices  that  leads  to  the 
"SUCCEED"  statement  at  the  end.  If  a  choice  must  be  made  from  an  empty  set  of  options 
then the execution fails. This can happen on line (6) if $q_i$ does not contain both a positive
and a negative literal, or on line (11) if $K = \emptyset$.

This interpretation of the choose statements motivates the following definition of
provability. We shortly shall see the justification for using the term "provable."

Definition: GSRA-Provability

Let $kb \Rightarrow q$ be a CNF sequent of QFPC. Then, q is GSRA-provable from kb (written
$kb \vdash_{GSRA} q$) iff there is some sequence of choices for which the Ground Sequent Retrieval
Algorithm halts with SUCCESS when input $kb \Rightarrow q$.

The  following  theorem  states  that  the  GSRA  meets  the  retrieval  specification  of  Section 
2.3;  that  is,  RP-entailment  and  GSRA-provability  are  one  and  the  same.  Armed  with  the 
results of Section 2.4, the proof of this theorem is straightforward.

GSRA Correctness Theorem

Let $kb \Rightarrow q$ be a CNF sequent of QFPC. Then, $kb \vdash_{GSRA} q$ iff $kb \models_{RP} q$.

Proof

The GSRA succeeds iff every iteration of the loop (with i ranging from 1 to s) succeeds. The
ith iteration succeeds iff either step (A) or step (B) succeeds. Step (A) succeeds iff $q_i$ has
complementary literals and step (B) succeeds iff for some
$b_i \in K$, $LITERALS(b_i) \subseteq LITERALS(q_i)$. Thus, by the Fact Decision Theorem for RP, the
ith iteration succeeds iff $b_i \models_{RP} q_i$ for some $b_i \in K$. By the No-Chaining Theorem this
happens precisely when $K \models_{RP} q_i$. The semantics of $\land$ says that a conjunction is satisfied iff each
of its conjuncts is. Therefore, $K \models_{RP} q_i$ for $1 \le i \le s$, iff $K \models_{RP} q$ iff $kb \models_{RP} q$. ■

Not  only  is  the  GSRA  correct,  but  when  executed  with  a  finite  input  each  of  its  steps  is 
effectively  computable  and  the  algorithm  terminates.  Crucial  to  this  claim  is  that,  when 
working  with  a  finite  input,  every  non-deterministic  choice  is  made  from  a  finite  set  of 
options that can be effectively constructed. For instance, in choosing an $F_i$ in line (13) the set
of all total functions from $LITERALS(b_i)$ to $LITERALS(q_i)$ can be constructed. If $b_i$ has r
literals and $q_i$ has s literals then there are $s^r$ such functions, each of which can be finitely
represented. Because all choices are made from finite sets, a deterministic machine can
execute this non-deterministic algorithm simply by trying all combinations of choices.
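
A deterministic rendering of the algorithm can be sketched as follows. It is a minimal sketch under stated assumptions, not the thesis's implementation: literals are strings with a hypothetical "~" prefix for negation, a fact is a frozenset of literals, a CNF sentence is a list of facts, and step (B) is collapsed into the subset test that the correctness proof shows it to be equivalent to, instead of enumerating the functions $F_i$.

```python
def clause(*literals):
    """A fact (disjunction of literals), represented as a frozenset."""
    return frozenset(literals)

def complement(lit):
    return lit[1:] if lit.startswith("~") else "~" + lit

def has_complementary_literals(q_i):
    # Step A: some positive literal is the complement of a negative literal.
    return any(complement(lit) in q_i for lit in q_i)

def gsra(kb, q):
    """Return True iff the CNF query q is retrievable from the CNF sentences in kb."""
    # Line (3): K holds every conjunct of every sentence in kb.
    K = {b for sentence in kb for b in sentence}
    for q_i in q:  # line (4): one iteration per conjunct of q
        step_a = has_complementary_literals(q_i)
        # Step B: some b in K with LITERALS(b) a subset of LITERALS(q_i).
        step_b = any(b <= q_i for b in K)
        if not (step_a or step_b):
            return False  # no sequence of choices succeeds for this conjunct
    return True

# KB containing P and R; query (P or Q) and (R or not-R).
kb = [[clause("P")], [clause("R")]]
print(gsra(kb, [clause("P", "Q"), clause("R", "~R")]))

# No chaining: Q is not retrievable from P together with (not-P or Q).
print(gsra([[clause("P")], [clause("~P", "Q")]], [clause("Q")]))
```

The second call illustrates the no-chaining restriction: the query follows classically from the two facts taken together, but from neither fact alone.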

A  survey  of  all  combinations  of  choices  is,  of  course,  a  search  space.  A  search  space  can 
be  displayed  as  a  tree  in  which  each  node  represents  a  choice  that  must  be  made  and  the  arcs 
emanating  from  a  node  represent  all  of  the  options  available  and  are  labeled  as  such.  The 
node  at  the  end  of  an  arc  represents  the  next  choice  that  must  be  made  after  that  arc  is 
chosen.  The  first  choice  that  an  execution  of  the  GSRA  encounters  is  located  at  the  root  of 
the  tree.  Every  path  originating  from  the  root  represents  an  initial  sequence  of  choices  that 
an  execution  could  make.  Hence,  a  path  from  the  root  to  a  leaf  represents  the  sequence  of 
choices  made  during  some  complete  execution  of  the  program  and  the  leaf  is  labeled 
"SUCCEED"  or  "FAIL”  according  to  the  outcome  of  that  execution. 

Figure  2.4  displays  the  search  space  implicitly  defined  when  the  GSRA  is  executed  on 
the input sequent $P, R \Rightarrow (P \lor Q) \land (R \lor \neg R)$. Because some of the leaf nodes are labeled
"SUCCEED", the algorithm successfully retrieves $(P \lor Q) \land (R \lor \neg R)$ from the KB containing
P and R.

A  consequence  of  the  previous  discussion  is  that  every  finite  sequent  has  a  finite  search 
space.  In  attempting  to  retrieve  a  sentence  with  s  conjuncts  the  GSRA  iterates  at  most  s 
times,  making  three  choices  on  each  iteration.  Hence,  a  search  space  has  a  depth  of  at  most 
3s.  Furthermore,  the  tree  is  finitary  since  all  choices  are  made  from  a  finite  number  of 
options.

Search spaces in the style of Figure 2.4 can be summarized in another style of display.
Observe that Subtree I in Figure 2.4 displays the following: an unsuccessful attempt to show
that $P \lor Q$ has complementary literals (terminating at node [2]), two unsuccessful attempts to
show $\{R\} \subseteq \{P, Q\}$ (terminating at nodes [8] and [9]), an unsuccessful attempt to show
$\{P\} \subseteq \{P, Q\}$ (terminating at node [7]), and a successful attempt to show $\{P\} \subseteq \{P, Q\}$.
Each of the unsuccessful attempts corresponds to an unsuccessful attempt to apply an inference
rule in a certain manner. Traditionally, deductive search spaces display only successful
rule applications, suppressing the failed attempts. Of course, a successful rule application is
not necessarily part of a successful proof.

From  here  on  I  follow  this  lead  by  displaying  only  those  choices  that  allow  execution  to 
continue to the end of the iteration in which they occur. So, only that part of Subtree I
containing nodes [1], [3], [4] and [6] is of interest. Since this represents the three choices made on
a particular iteration, they are collapsed into a single arc labeled with the three choices.
Following this convention, Figure 2.4 can be redisplayed as Figure 2.5. Notice that arcs 1-6, 6-11,
and 6-15 of Figure 2.5 summarize respectively Subtrees I, II, and III of Figure 2.4.

The  discussion  has  brought  us  to  an  appropriate  point  for  making  a  rare  comment 
about  implementation  issues.  A  good  implementation  of  the  GSRA  could  indeed  avoid  many 
of  the  unsuccessful  choices  that  are  suppressed  in  the  condensed  style  of  search  space.  For 
example, in step (B), $b_i$ is chosen and then a test is made to see if
$LITERALS(b_i) \subseteq LITERALS(q_i)$. An implementation could exploit an indexing scheme for
accessing  all  elements  of  K  that  pass  the  subset  test  while  avoiding  those  that  fail. 
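
One possible indexing scheme, offered only as an illustration since the text does not specify one, is an inverted index from literals to the facts of K that contain them. The sketch assumes facts are non-empty frozensets of literal strings; `build_index` and `subsumed_by` are hypothetical names.

```python
from collections import defaultdict

def build_index(K):
    """Map each literal to the facts of K in which it occurs."""
    index = defaultdict(list)
    for b in K:
        for lit in b:
            index[lit].append(b)
    return index

def subsumed_by(index, q_i):
    """Return the facts b in K with LITERALS(b) a subset of LITERALS(q_i).

    Only facts sharing at least one literal with q_i are ever touched;
    a fact passes when every one of its literals was found in q_i.
    """
    hits = defaultdict(int)
    for lit in q_i:
        for b in index.get(lit, ()):
            hits[b] += 1
    return [b for b, n in hits.items() if n == len(b)]

K = {frozenset({"P"}), frozenset({"R"}), frozenset({"P", "S"})}
index = build_index(K)
print(subsumed_by(index, frozenset({"P", "Q"})))
```

With such an index, facts that share no literal with the query conjunct are never examined, which is the kind of saving the paragraph above has in mind.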

This  chapter  has  progressed  from  a  model  theory  directly  to  an  algorithm,  avoiding 
proof  theory  in  its  traditional  form  of  axioms  and  inference  rules.  This  is  appropriate  in  a 
study  of  knowledge  retrieval,  which  is,  after  all,  a  computational  process.  The  simplicity  of 
RP  allowed  for  a  smooth  transition  from  the  model  theory  to  the  algorithm. 

Though this study has no need for a traditional proof theory, proofs themselves are useful
objects. While the retrieval algorithm can determine what follows from what, a proof
provides  an  argument  that  it  does  follow.  Furthermore  it  is  often  valuable  to  examine  the 
properties  of  proofs,  such  as  their  size,  and  relationships  between  proofs,  such  as  the  so-called 
lifting  relationship,  which  is  used  in  the  next  chapter.  Fortunately,  we  already  have  at  hand 
constructions  that  can  naturally  be  called  proofs;  a  proof  is  a  path  through  a  search  space 
originating  at  the  root  and  terminating  at  a  leaf  node  labeled  "SUCCEED".  This  definition  is 
appropriate  because  a  claim  that  a  certain  path  is  a  proof  of  a  certain  sequent  can  be  readily 
tested. 

One  last  issue  deserves  attention.  Recall  that  the  GSRA  works  for  any  query  q  in  CNF; 
in particular, it works for any query obtained by permuting the conjuncts of q. Many problem
solvers, including theorem provers, use a selection function to determine what order to
work on the components of a conjunctive goal. The GSRA could employ a selection function.
At the outset of each iteration it would select one conjunct of q from those that have not
been selected previously. It is clear that the correctness of the GSRA is independent of what
selection function is used.

Though  the  order  in  which  the  conjuncts  of  a  query  are  retrieved  does  not  affect  the 
correctness  of  the  GSRA,  it  can  affect  the  search  space.  The  obvious  example  is  of  a  query 
that  fails.  The  sooner  that  a  failing  conjunct  is  selected,  the  smaller  the  resulting  search 
space. Figure 2.6 shows the search space for $Q \Rightarrow (P \lor Q \lor \neg P) \land \neg Q$ while the search space for
$Q \Rightarrow \neg Q \land (P \lor Q \lor \neg P)$ consists of a single node with no arcs.

Even for successful queries the order matters. By reordering $P, R \Rightarrow (P \lor Q) \land (R \lor \neg R)$
to $P, R \Rightarrow (R \lor \neg R) \land (P \lor Q)$ the search space of Figure 2.5 is transformed to that of Figure
2.7. The modifications caused by reordering successful queries are inconsequential because, at
least presently, we can be content with finding one proof, a task whose difficulty is independent
of goal ordering.

Not only can the conjuncts of a query be reordered, but they can be retrieved independently
of each other. The latter is a stronger property that entails the first. The stronger
property obtains as a consequence of the semantic definition of conjunction, which implies
that

$kb \models_{RP} q_1 \land \cdots \land q_n$ iff $kb \models_{RP} q_1$ and $\cdots$ and $kb \models_{RP} q_n$

for any sequent $kb \Rightarrow q_1 \land \cdots \land q_n$. Furthermore, if this sequent happens to be in CNF, then
the GSRA Correctness Theorem implies that

$kb \vdash_{GSRA} q_1 \land \cdots \land q_n$ iff $kb \vdash_{GSRA} q_1$ and $\cdots$ and $kb \vdash_{GSRA} q_n$

Alternatively, this property can be observed by direct examination of the GSRA. Each iteration
of the algorithm is independent of all previous iterations; nothing computed during one
iteration is carried forward for use during future iterations. Problems that can be divided
into independent subproblems the way that CNF retrievability can are called "decomposable"
and an algorithm that decomposes such problems confronts an AND/OR search space (Nilsson,
1980).

Of the two properties, selection-function independence and decomposability, the latter
is more important in constructing an efficient retriever for QFPC sequents. Yet this
chapter now concludes having examined selection functions more closely than decomposition.
This has been done in an attempt to lay the groundwork for examining the retrievers of the
following chapters, all of which are independent of selection function though none are
decomposable.
Chapter 3


A  Retriever  that  Handles  Wh-Queries 

The retriever specified in the previous chapter responds to queries merely by indicating
success or failure. In this sense, such queries correspond to yes/no questions in English.
Accessing a KB with this retriever would be like playing a game of Twenty Questions. This chapter
specifies a retriever that responds to certain queries by supplying a set of answers. Such queries
are more like English wh-questions.

The specification of this retriever generalizes the specification of the last chapter by allowing
sentence schemas of QFPC as queries and as elements of the KB. Thus, we are now concerned
with retrieval problems that are characterized by finite schematic sequents of QFPC.
This chapter defines what schematic sentences are and specifies a retrievability relation for
schematic sequents in terms of the RP-entailment relation for non-schematic sequents.

3.1.  An  Approach  to  Wh-Queries 

Let us consider the rather obvious way that the retriever of the last chapter can be used in
answering a yes/no question. Then by analogy we can reason about what capabilities a retriever
would need in order for it to be used in answering a wh-question. Though the analogy
proceeds by considering a sequence of English sentences, I am not making any linguistic claims.

Suppose  we  wished  to  answer  the  question 

Does  Roth  sell  expert  systems?  (3.1) 

Ideally  we  would  like  to  turn  to  the  retriever  and  issue  the  imperative 

Tell  me  whether  "Roth  sells  expert  systems"  is  true.  (3.2) 

but,  of  course,  the  best  that  the  retriever  can  do  is  respond  to 

Tell  me  whether  "Roth  sells  expert  systems"  is  retrievable.  (3.3) 

Indeed, this is what the retriever does when given the query "Roth sells expert systems"
(expressed in the logical language, of course). In general, we give the retriever a queried
sentence q and in doing so issue the command

Tell  me  whether  q  is  retrievable.  (3.4) 

Now  let  us  consider  wh-queries  in  an  analogous  way.  Suppose  that  we  wish  to  answer  the 
question 
which corresponds to the imperative

Name every x and y such that x sells y. (3.6)

However,  the  form  of  this  imperative  bears  little  resemblance  to  the  form  of  its  analog,  (3.2). 
The  former  imperative  contains  a  quoted  sentence  and  asks  for  a  report  on  its  truth.  Putting 
(3.6)  in  this  form  results  in 

Name every pair of terms such that "x sells y" is a true sentence when the
first element of the pair is substituted for x and the second element of the
pair is substituted for y. (3.7)

The awkwardness of this request is caused by the need to correlate the entries in the
requested pairs with x and y. Some of this can be avoided by asking for a substitution rather
than a pair. That is, instead of asking for pairs such as (Roth, expert systems) and (Wilson,
designer drugs) it is simpler to ask for substitutions such as "substitute 'Roth' for x and 'expert
systems' for y" and "substitute 'Wilson' for x and 'designer drugs' for y". By asking for
substitutions, the answer itself supplies the necessary correlations. Following this strategy, (3.7) can
be rephrased as

Name every substitution for x and y whose application to "x sells y" results
in a true sentence. (3.8)

However,  as  before,  the  best  that  the  retriever  can  do  is  respond  to  the  command 

Name every substitution for x and y whose application to "x sells y" results
in a retrievable sentence. (3.9)

In  general,  we  give  the  retriever  a  sentence  schema  q  and  in  doing  so  issue  the  command 

Name every substitution for the schematic variables in q whose application
to q results in a retrievable sentence. (3.10)

This all but specifies a retrievability relation that handles wh-queries in the form of
schematic queries. In (3.10) the meaning of "retrievable" is precisely that given by the
retrievability relation $\models_{RP}$. So, the specification of retrievability formalized in this chapter does not
fold, spindle or mutilate RP; rather, it defines retrievability of schematic queries in terms of
substitution and RP-entailment.

Following  the  next  section’s  formalization  of  the  notion  of  substitution,  Section  3.3 
specifies  the  retrievability  relation  and  Section  3.4  presents  a  retrieval  algorithm  that  meets  the 
specification.  As  the  reader  familiar  with  automated  deduction  might  suspect,  this  algorithm 
uses  unification  to  lift  the  Ground  Sequent  Retrieval  Algorithm  to  the  schematic  level. 
3.2. Substitutions and Unifiers

This section presents the basic and fairly standard definitions and theorems concerning
substitutions and unifiers. The reader familiar with this subject may wish to skim the section
to familiarize himself with my notation. I have attempted to include those and only those
definitions and theorems on which the remainder of this chapter rests. Plotkin (1972), Robinson
(1979), Huet and Oppen (1980), and Eder (1985) cover this topic in more detail.

Intuitively,  a  substitution  is  a  function  that  maps  an  expression  such  as  "x  sells  y"  to 
another  expression  such  as  "Wilson  sells  designer  drugs."  Basically,  the  mapping  works  by 
replacing  variables  with  expressions  while  leaving  everything  else  alone. 

I follow the common conventions of naming substitutions by Greek letters (in this case, $\theta$,
$\sigma$, $\gamma$, $\rho$, $\lambda$ and $\Theta$) and writing the application of a substitution $\theta$ to an expression $e$ as $e\theta$ rather
than $\theta(e)$.

Definition:  Substitution 

A substitution is a function $\theta$ from expressions to expressions such that for every expression $e$:

(1) If $e$ is a constant then $e\theta = e$.

(2) If $e$ is composed of $e_1, e_2, \ldots, e_n$ then $e\theta$ is composed of $e_1\theta, e_2\theta, \ldots, e_n\theta$ in the
same manner.

(3) If $e$ is a variable then $e\theta$ is an expression.

A substitution may also be applied to a set or a tuple of expressions; in such a case the
substitution is merely applied to each expression in the set or tuple. Thus, if $E$ is a set of
expressions, then $E\theta = \{e\theta \mid e \in E\}$ and if $T = (e_1, e_2, \ldots, e_n)$ is an n-tuple of expressions, then
$T\theta = (e_1\theta, e_2\theta, \ldots, e_n\theta)$.
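
The three clauses of the definition translate almost directly into code. The following Python sketch is illustrative only; the term encoding (strings beginning with "?" for variables, other strings for constants, tuples headed by a functor name for compound expressions) and the names `is_var`, `apply` and `apply_all` are assumptions made for the example, not notation from the text.

```python
# Terms: "?x" is a variable, "a" is a constant,
# ("f", t1, ..., tn) is a compound expression with functor "f".

def is_var(t):
    return isinstance(t, str) and t.startswith("?")

def apply(theta, e):
    """Return e with the substitution theta (a dict from variables to terms) applied."""
    if is_var(e):
        return theta.get(e, e)      # clause (3): a variable maps to some expression
    if isinstance(e, tuple):
        # clause (2): rebuild the compound "in the same manner" from its parts
        return (e[0],) + tuple(apply(theta, arg) for arg in e[1:])
    return e                        # clause (1): constants are left alone

def apply_all(theta, exprs):
    """Apply theta to each member of a set or list of expressions."""
    return type(exprs)(apply(theta, e) for e in exprs)

theta = {"?x": "Wilson", "?y": "designer drugs"}
sentence = apply(theta, ("sells", "?x", "?y"))
```

Only the bindings for variables are stored, which reflects the observation that a substitution is determined by its treatment of the variables.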

Observe that substitutions are uniquely determined by their treatment of the variables;
that is, $\theta_1 = \theta_2$ iff $v\theta_1 = v\theta_2$ for every variable $v$. The substitutions that arise in this work are
such that $v\theta \ne v$ for only a finite number of variables $v$. By an abuse of terminology, the set of
all such variables is called the domain of $\theta$, or simply $DOM(\theta)$. $\theta$ is said to be a substitution for
a set of variables $V$ if $DOM(\theta) \subseteq V$.

VARS(e)  denotes  the  set  of  all  variables  that  occur  in  expression  e.  If  it  is  empty,  e  is 
said  to  be  ground.  A  substitution  is  said  to  be  ground  if  it  maps  every  variable  in  its  domain  to 
a  ground  expression. 

$e'$ is said to be an instance of $e$ if $e' = e\theta$ for some substitution $\theta$. Of particular interest
are the ground instances of an expression, that is, those instances that have no variables. If $e$ is an
expression, then $e_{gr}$ is the set of all of its ground instances, and if $E$ is a set of expressions, then
$E_{gr}$ is the set of all ground instances of all expressions in $E$.

An algorithm needs a systematic way of naming each substitution it uses with a finite
expression. Since we are only concerned with substitutions that have finite domains, this can be
accomplished in the following straightforward way. Suppose that the domain of a substitution
is $\{x_1, x_2, \ldots, x_n\}$ and that each $x_i$ is mapped to some expression $t_i$. Then this substitution is
denoted by $\{t_1/x_1, t_2/x_2, \ldots, t_n/x_n\}$. This naming scheme is also used in the text.

Substitutions, being functions, can be composed. If $\theta$ and $\sigma$ are substitutions then their
composition, $\theta \cdot \sigma$, is $\lambda e.\,\sigma(\theta(e))$. In other words, $\theta \cdot \sigma$ is such that $e(\theta \cdot \sigma) = (e\theta)\sigma$ for all expressions
$e$.
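
For substitutions with finite domains, composition can be computed on the finite representation directly. The sketch below is one way to do so in Python, under the same assumed encoding as before ("?"-prefixed strings for variables, tuples for compound expressions); `compose` is a hypothetical helper name. It applies the second substitution to every binding of the first, then adds the bindings of the second for variables the first leaves alone.

```python
def is_var(t):
    return isinstance(t, str) and t.startswith("?")

def apply(theta, e):
    """Apply the substitution theta (dict: variable -> term) to expression e."""
    if is_var(e):
        return theta.get(e, e)
    if isinstance(e, tuple):
        return (e[0],) + tuple(apply(theta, arg) for arg in e[1:])
    return e

def compose(theta, sigma):
    """Return the substitution that applies theta first and sigma second."""
    out = {v: apply(sigma, t) for v, t in theta.items()}
    for v, t in sigma.items():
        out.setdefault(v, t)        # only variables that theta does not move
    # drop bindings of the form v -> v so the domain stays minimal
    return {v: t for v, t in out.items() if t != v}

theta = {"?x": ("f", "?y")}
sigma = {"?y": "a", "?x": "b"}
e = ("P", "?x", "?y", "?z")
```

With these values, applying `compose(theta, sigma)` to `e` gives the same expression as applying `theta` and then `sigma`, which is the defining property of composition.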

The following lemmas concerning substitution are crucial for this work but are stated
without proof because they are so widely known. (See, for instance, Robinson (1979) or Loveland
(1978).)

Substitution  Lemmas 

(1) The identity function, henceforth denoted by $\epsilon$, is a substitution. Furthermore,
$\theta \cdot \epsilon = \epsilon \cdot \theta = \theta$, for any substitution $\theta$.

(2) If $\sigma$ and $\theta$ are substitutions, then so is $\sigma \cdot \theta$.

(3) The composition of substitutions is associative. That is, $(\theta_1 \cdot \theta_2) \cdot \theta_3 = \theta_1 \cdot (\theta_2 \cdot \theta_3)$, for
substitutions $\theta_1$, $\theta_2$, and $\theta_3$.

Because of this last lemma, parentheses are not needed when writing compositions of
substitutions.

$\theta$ is said to be a renaming substitution iff for any variables $x$ and $y$, $x\theta$ and $y\theta$ are
variables and $x\theta = y\theta$ iff $x = y$. Expression $e_1$ is a variant of expression $e_2$ iff $e_2\theta = e_1$ for some
renaming substitution $\theta$.

Given expressions $e$ and $e'$ we will often want to know whether $e_{gr}$ and $e'_{gr}$ intersect.
This is equivalent to determining whether $\{e, e'\}$ is unifiable, in the following sense.

Definition:  Unifier 

Let $E$ be a set of expressions and $\theta$ be a substitution. If $E\theta$ is a singleton then $\theta$ is said to be a
unifier of $E$ or, alternatively, $\theta$ is said to unify $E$.

Often we will want to compute the set of all unifiers of $E$, but this may be infinite. Luckily,
we need not be concerned with substitutions that can be obtained from others by
composition; we only need a (hopefully finite) basis from which we can generate all substitutions
in the set. This can be done by ordering the substitutions and forming the basis of a set
of substitutions from certain representatives of its maximal elements.

Definition:  More  General 

Substitution $\theta_1$ is more general than substitution $\theta_2$ (written $\theta_1 \ge \theta_2$) iff $\theta_1 \cdot \sigma = \theta_2$ for some
substitution $\sigma$.




Note that $\ge$ is reflexive and transitive but not anti-symmetric. Hence, despite its
typographic appearance, $\ge$ is not a partial order.
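
A small worked example, not taken from the text, makes the failure of anti-symmetry concrete. Take two distinct variables $x$ and $y$ and the two substitutions below; each is obtained from the other by composition, so each is more general than the other, yet they are different substitutions.

```latex
\[
\theta_1 = \{y/x\}, \qquad \theta_2 = \{x/y\}
\]
\[
x(\theta_1 \cdot \theta_2) = y\theta_2 = x, \qquad
y(\theta_1 \cdot \theta_2) = y\theta_2 = x
\quad\Longrightarrow\quad \theta_1 \cdot \theta_2 = \{x/y\} = \theta_2
\]
\[
x(\theta_2 \cdot \theta_1) = x\theta_1 = y, \qquad
y(\theta_2 \cdot \theta_1) = x\theta_1 = y
\quad\Longrightarrow\quad \theta_2 \cdot \theta_1 = \{y/x\} = \theta_1
\]
```

The first computation witnesses that $\theta_1$ is more general than $\theta_2$ (take $\sigma = \theta_2$ in the definition) and the second that $\theta_2$ is more general than $\theta_1$ (take $\sigma = \theta_1$), although $\theta_1 \ne \theta_2$.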

Definition:  Complete  Set 

Let $\Theta'$ and $\Theta$ be sets of substitutions. Then $\Theta'$ is a complete set of $\Theta$ iff:

(1) $\Theta'$ is correct; if $\theta' \in \Theta'$ and $\theta' \ge \theta$ then $\theta \in \Theta$.

(2) $\Theta'$ is complete; if $\theta \in \Theta$ then for some $\theta' \in \Theta'$, $\theta' \ge \theta$.

Not every set of substitutions has a complete set. For instance, if $\theta = \{f(y)/x\}$ then
$\Theta = \{\theta\}$ does not have a complete set. For if it did, the completeness condition insists that such
a set contains a substitution $\sigma \ge \theta$. However, the correctness condition would then be violated
because $\Theta$ fails to contain every substitution less general than $\sigma$, for instance, $\{f(a)/x\}$.

When a set of substitutions does have a complete set, it has many. However, they are not
all created equal; some are smaller than others. Let us consider an example. The set of unifiers
of $\{x, a\}$ has many complete sets, including $\{\{a/x\}\}$ and $\{\{a/x\}, \{a/x, b/y\}\}$. The second set is
larger than the first because it contains redundant information; notably, one of its substitutions
is less general than the other. Accordingly, when representing a set of substitutions with a
complete set it is economical to use a complete set that is most general in the following sense:

Definition:  Most  General  Set  of  Substitutions 

A  set  of  substitutions  is  most  general  if  it  does  not  contain  two  distinct  substitutions  such  that 
one  is  more  general  than  the  other. 

I  often  will  be  concerned  with  most  general  complete  sets  of  the  unifiers  of  some  set  of 
expressions  E  and  therefore  call  such  a  set  an  MGCU  of  E .  The  above  example  states  that  the 
set  of  all  unifiers  of  {a,x}  has  a  complete  set.  In  fact,  the  set  of  all  unifiers  of  any  given  set  of 
expressions  has  a  complete  set.  This  is  a  consequence  of  Robinson’s  (1965)  celebrated 
Unification  Theorem,  which  also  states  that  the  Unification  Algorithm  computes  MGCU’s  of 
cardinality  1. 

Unification  Theorem 

Let $E$ be any finite set of expressions. Then $E$ is unifiable iff the Unification Algorithm so
indicates upon termination. Moreover, the substitution $\sigma$ then available as output is such that $\{\sigma\}$
is an MGCU of $E$.
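
To make the theorem concrete, here is a minimal Python sketch of unification for a pair of expressions. It is a common recursive formulation with an occurs check, and is not necessarily the presentation of the Unification Algorithm that the theorem refers to. The term encoding ("?"-prefixed strings for variables, tuples for compound expressions) and all function names are assumptions made for the example.

```python
def is_var(t):
    return isinstance(t, str) and t.startswith("?")

def apply(theta, e):
    """Apply the substitution theta (dict: variable -> term) to expression e."""
    if is_var(e):
        return theta.get(e, e)
    if isinstance(e, tuple):
        return (e[0],) + tuple(apply(theta, arg) for arg in e[1:])
    return e

def occurs(v, t):
    """True if variable v occurs anywhere in term t."""
    return t == v or (isinstance(t, tuple) and any(occurs(v, a) for a in t[1:]))

def unify(e1, e2, theta=None):
    """Return a most general unifier of e1 and e2 as a dict, or None if none exists."""
    theta = {} if theta is None else theta
    e1, e2 = apply(theta, e1), apply(theta, e2)
    if e1 == e2:
        return theta
    for v, t in ((e1, e2), (e2, e1)):
        if is_var(v):
            if occurs(v, t):
                return None         # e.g. x against f(x): no finite unifier
            new = {u: apply({v: t}, s) for u, s in theta.items()}
            new[v] = t
            return new
    if (isinstance(e1, tuple) and isinstance(e2, tuple)
            and e1[0] == e2[0] and len(e1) == len(e2)):
        for a, b in zip(e1[1:], e2[1:]):
            theta = unify(a, b, theta)
            if theta is None:
                return None
        return theta
    return None                     # clash of constants or functors
```

In the terminology above, when `unify` returns a substitution `s`, the singleton set containing `s` plays the role of an MGCU; for the earlier example of unifying x with a, the result is the single binding of x to a.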

Often,  we  will  be  concerned  with  the  behavior  of  a  substitution  on  only  a  certain  set  of 
variables.  This  motivates  the  definition  of  the  restriction  of  a  substitution. 


Definition:  Restriction  of  a  Substitution 

If $V$ is a set of variables and $\theta$ is a substitution, then $\theta$ restricted to $V$, or $\theta/V$, is the
substitution such that for every variable $v$:

(1) if $v \in V$ then $v(\theta/V) = v\theta$.

(2) if $v \notin V$ then $v(\theta/V) = v$.
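
When a substitution is stored as a finite table of bindings, restriction amounts to discarding the bindings for variables outside $V$. A minimal Python sketch, assuming substitutions are dicts from variable names to terms (the name `restrict` and the "?" prefix for variables are hypothetical conventions for the example):

```python
def restrict(theta, variables):
    """Keep only the bindings of theta whose variable lies in the given set."""
    return {v: t for v, t in theta.items() if v in variables}

theta = {"?z": "a", "?x": "?w", "?v": "?u"}
restricted = restrict(theta, {"?w", "?z", "?u"})
```

Variables outside the set are simply absent from the result, so they are mapped to themselves, as clause (2) requires.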

We  will  have  occasion  to  use  the  following  lemmas,  which  follow  immediately  from  the 
definition  of  "restriction." 

Restriction  Lemmas 

Let $\theta$ be a substitution, $V$ be a set of variables, and $e$ be an expression. Then:

(1) $VARS(e) \subseteq V$ implies $e\theta = e(\theta/V)$.

(2)  0/V  >  e. 

(3)  (0,-OJ/V  >  h/V*v 

3.3.  The  Specification  of  Retrievability  of  Schematic  Sequents 

Recall  that  our  goal  is  to  specify  a  retriever  that  responds  to  a  queried  sentence  schema  by 
supplying  a  substitution  for  the  schematic  variables  in  the  sentence  schema.  This  idea  is  now 
formalized  starting  with  the  definitions  of  "schematic  variable"  and  "sentence  schema.” 

Schematic variables are meta-linguistic variables that range over the terms of the object
language (QFPC, in this case). Sentence schemas are identical to sentences except that
schematic variables may appear anywhere that terms may. In particular, every ground sentence
is a sentence schema that happens not to contain any schematic variables. Schematic
sequents can be constructed from schematic sentences just as ground sequents are constructed
from ground sentences.

Schematic variables are distinct from the so-called logical variables, object-language variables
used for quantification. To reflect this distinction, schematic variables always wear hats,
e.g., $\hat{x}$, $\hat{y}$, and $\hat{z}$. The distinction between schematic and logical variables is not important in
this chapter because QFPC sequents do not contain logical variables. However, with the introduction
of quantification in the next chapter, logical variables will occur in sequents and the distinction
will be vital. To help keep the distinction clear, I use the expression SVARS(e) to refer
to the set of schematic variables in e and LVARS(e) to refer to the set of logical variables in e.

Though sentence schemas are neither true nor false, the domain of the retrievability relation
can be extended to include schematic sequents. The motivation for this extension is the
idea that a schematic query should succeed iff one of its ground instances is retrievable. In
addition, schematic sentences can be contained in a KB with the understanding that this is
equivalent to a KB that contains all instances of its schemas. The specification of retrievability
of schematic sequents is stated more precisely by the following definition.


Definition:  Retrievability  of  Schematic  Sequents 

Let $kb \Rightarrow q$ be a schematic sequent of QFPC and $\theta$ be a ground substitution for SVARS(q).
Then $q$ is retrievable from $kb$ with answer $\theta$ iff $kb_{gr} \models_{RP} q\theta$. Additionally, $\theta$ is said to be an
answer to $kb \Rightarrow q$.

Notice that a retriever meeting this specification is sound in that any answer $\theta$ to $kb \Rightarrow q$ is
such that $kb_{gr} \models q\theta$. Also notice that if a ground query is retrievable, it is retrievable with only
one answer, $\epsilon$. What's more, this definition of retrievability and the definition of the last
chapter coincide over the set of ground sequents. That is, a ground sequent $kb \Rightarrow q$ is in the
above retrievability relation iff $kb \models_{RP} q$.

Given a sequent we could consider various problems: the problem of deciding whether it
has an answer, the problem of finding some specified number of answers to it, or the problem of
finding all answers to it. Since a solution to the last of these would provide a solution to the
previous two, we henceforth only consider the problem of finding all answers to a sequent.
Therefore, every retrieval problem can be characterized by a finite schematic sequent; $kb \Rightarrow q$
characterizes the problem of finding every $\theta$ such that $q$ is retrievable from $kb$ with answer $\theta$.

In response to a query, the obvious thing for a retriever to do is to hand back a list of all
answers. However, this is impossible because many schematic sequents have an infinite set of
answers. For example, among the answers to $P(f(\hat{x})) \Rightarrow P(\hat{y})$ are

$\{f(a)/\hat{y}\},\ \{f(f(a))/\hat{y}\},\ \{f(f(f(a)))/\hat{y}\},\ \ldots$ (3.11)

The solution to this difficulty lies in finding a way to finitely characterize infinite sets of
answers. Fortunately such a method is at hand. Notice that the substitutions in (3.11) are all
less general than $\sigma = \{f(\hat{x})/\hat{y}\}$. Moreover, every ground substitution for $\{\hat{y}\}$ that is less general
than $\sigma$ is an answer. Therefore, $\sigma$ can be used to characterize an infinite set of answers, and
accordingly is called a generalized answer. Note, however, that $\sigma$ itself is not an answer because
$\hat{y}\sigma$ is not ground.

Definition:  Generalized  Answer 

Let $kb \Rightarrow q$ be a schematic sequent of QFPC and $\gamma$ be a substitution for SVARS(q). Then $\gamma$ is a
generalized answer to $kb \Rightarrow q$ iff every ground substitution $\theta \le \gamma$ for SVARS(q) is an answer to
$kb \Rightarrow q$.

The set of generalized answers to a sequent is no smaller than the set of answers to that
sequent; in fact, every answer is a generalized answer. However, generalized answers have the
important property that they can be characterized by finite complete sets. Furthermore, given
a complete set of generalized answers $\Gamma$ to $kb \Rightarrow q$, it is easy to recover the set of all answers: it
is simply

$\{\theta \mid \theta$ is a ground substitution for SVARS(q), and $\theta \le \gamma$ for some $\gamma \in \Gamma\}$
This, then, provides the solution: I characterize the set of answers to a sequent by a finite
complete set of its generalized answers. This characterization depends on every finite schematic
sequent of QFPC having a finite complete set of generalized answers. Indeed, this is the case
and is so proved in the next section where I present a retrieval algorithm and prove that, when
input a finite schematic sequent of QFPC, the algorithm always halts and outputs a finite complete
set of generalized answers to the input sequent.


3.4.  The  Retrieval  Algorithm 

This  section  addresses  the  problem  of  computing  a  complete  set  of  generalized  answers  to 
a  schematic  sequent.  Once  again  the  solution  is  divided  into  two  stages:  the  first  transforms  an 
arbitrary  sequent  into  a  CNF  sequent  and  the  second  computes  a  complete  set  of  generalized 
answers  to  a  CNF  sequent. 

The Conjunctive Normal Transformation can be applied to sentence schemas in the same
way that it is applied to ground sentences; a schematic variable is treated just like any other
term. As the following theorem tells us, the CNT preserves retrievability and therefore every
retrieval problem can be transformed into a CNF retrieval problem. One observation lies at
the root of this theorem's proof: the composition of the CNT with any substitution is commutative.
In other words, $CNT(\alpha\theta) = CNT(\alpha)\theta$, for every substitution $\theta$ and sentence schema $\alpha$.

CNT  Theorem  for  Schematic  Sequents 

Let $kb \Rightarrow q$ be a schematic sequent of QFPC. Then $\theta$ is an answer to $kb \Rightarrow q$ iff it is an answer to
the CNT of $kb \Rightarrow q$.

Proof 

Consider the two conditions necessary for $q$ to be retrievable from $kb$ with answer $\theta$. First, $\theta$
must be a ground substitution for SVARS(q), which obviously is the case iff it is a ground
substitution for SVARS(CNT(q)). Second, $kb_{gr}$ must RP-entail $q\theta$. By the Conjunctive Normal
Transformation Theorem, this is the case iff $CNT(kb_{gr}) \models_{RP} CNT(q\theta)$, which, by commutativity,
is the case iff $CNT(kb)_{gr} \models_{RP} CNT(q)\theta$. ■

Once a schematic sequent has been transformed into CNF, the Schematic Sequent
Retrieval Algorithm (or SSRA) below can be used to compute a complete set of generalized answers
to it. The essence of the SSRA is that its computations are, in some sense, schemas for the
computations performed by the Ground Sequent Retrieval Algorithm. This method of handling
sentence schemas is commonplace in the field of automated deduction, where the schematic
computations are said to lift the ground computations. The correctness of the SSRA is
established via the Lifting Theorem, which says that every ground instance of a schematic proof is
a ground proof and every ground proof is an instance of some schematic proof.
Schematic Sequent Retrieval Algorithm

Input: $kb \Rightarrow q$, a schematic CNF sequent of QFPC

Output: SUCCESS or FAILURE; if SUCCESS then a substitution is also output

(1) let $s$ = number of conjuncts in $q$

(2) let $q_i$ = $i$th conjunct of $q$ ($1 \le i \le s$)

(3) let $K$ be the set containing every conjunct of every sentence in $kb$
(3') let $\Theta_0 = \epsilon$

(4) for $i$ = 1 to $s$ do

(5)    choose to do either step A) or step B)

(6)    A) choose $p_i$, a positive literal in $q_i$

(7)       choose $n_i$, the complement of a negative literal in $q_i$

(8)       if $n_i\Theta_{i-1}$ and $p_i\Theta_{i-1}$ are unifiable

(9)       then let $U_i$ be any MGCU of $n_i\Theta_{i-1}$ and $p_i\Theta_{i-1}$

               choose $\theta_i \in U_i$

               let $\Theta_i = (\epsilon \cdot \theta_1 \cdot \theta_2 \cdots \theta_i)/VARS(q)$

(10)      else FAIL

(11)   B) choose $b_i \in K$ and rename it so that $VARS(b_i) \cap VARS(q\Theta_{i-1}) = \emptyset$

(12)      let $l_{i,1}, \ldots, l_{i,m}$ be the literals of $b_i$

(13)      choose $F_i$, a total function from $LITERALS(b_i)$ to $LITERALS(q_i)$

(14)      if $(l_{i,1}, \ldots, l_{i,m})$ and $(F_i(l_{i,1}), \ldots, F_i(l_{i,m}))\Theta_{i-1}$ are unifiable

(15)      then let $U_i$ be any MGCU of $(l_{i,1}, \ldots, l_{i,m})$ and $(F_i(l_{i,1}), \ldots, F_i(l_{i,m}))\Theta_{i-1}$

               choose $\theta_i \in U_i$

               let $\Theta_i = (\epsilon \cdot \theta_1 \cdot \theta_2 \cdots \theta_i)/VARS(q)$

(16)      else FAIL

(17) SUCCEED and output $\Theta_s$

As  before,  the  SSRA  defines  a  provability  relation. 

Definition:  SSRA-provable 

Let $kb \Rightarrow q$ be a schematic CNF sequent of QFPC. Then, $q$ is SSRA-provable from $kb$ (written
$kb \vdash_{SSRA} q$) with extracted answer $\theta$ iff there is some sequence of choices for which the Schematic
Sequent Retrieval Algorithm halts with SUCCESS and outputs $\theta$ when input $kb \Rightarrow q$.

The  principal  result  of  this  section  is  that  every  extracted  answer  is  a  generalized  answer 
and  that  the  set  of  extracted  answers  produced  by  all  proofs  is  a  complete  set.  Before  turning 
our  attention  to  the  series  of  results  that  lead  to  this  conclusion,  let  us  examine  the  SSRA  itself, 
particularly  by  comparison  to  the  GSRA. 

The Ground Sequent Retrieval Algorithm and the Schematic Sequent Retrieval Algorithm
have the same form. This is reflected by the step numbering, which indicates the
correspondence of the steps in the two algorithms.

The most prominent addition incorporated in the SSRA is the sequence of substitutions
$\Theta_0, \Theta_1, \ldots, \Theta_s$. Each $\Theta_i$ is called the $i$th partial extracted answer and, as previously defined,
$\Theta_s$ is called the extracted answer. Each $\Theta_i$ is a maximally general substitution that can be
applied to $q$ in order that the computation can complete $i$ iterations making the choices that it
did. For $0 < i \le s$, $\Theta_i = (\epsilon \cdot \theta_1 \cdots \theta_i)/VARS(q)$ where, as we shall see, $\theta_i$ is a maximally general
substitution that can be applied to $q\Theta_{i-1}$ in order that the computation can complete the $i$th
iteration given the choices that have been made on the first $i-1$ iterations. Accordingly, $\Theta_0$ is
initialized to $\epsilon$ just before the start of the first iteration (in step (3')) and each successive $\Theta_i$ is
computed during the $i$th iteration. Note also that $\Theta_{i-1} \ge \Theta_i$, for $1 \le i \le s$.

Now let us consider how each $\theta_i$, and consequently $\Theta_i$, is derived. The equality tests of (8)
and (14) have been replaced with unifiability tests. This is what one expects; two schematic
expressions are unifiable iff some ground instance of the first schema is equal to a ground
instance of the second. In each case, if the unifiability test succeeds, then $\theta_i$ is chosen from
among the elements of an MGCU. $\theta_i$, as computed in steps (9) and (15), is a maximally general
substitution that allows the expressions of steps (8) and (14), respectively, to be unified. Once
$\theta_i$ is computed, steps (9) and (15) each proceed to compute $\Theta_i$.

The only other difference between the SSRA and the GSRA is located in step (11). Here,
the schematic algorithm renames $b_i$ in order to avoid variable name clashes.

The sequence of $\Theta_i$'s computed by a schematic computation encodes the relationship
between the schematic computation and all of the ground computations that are instances of
the schema. Consider the example of retrieving $q = P(\hat{w},\hat{z},\hat{u}) \land Q(\hat{w},\hat{z},\hat{u})$ from
$kb = \{P(\hat{x},a,\hat{v}),\ Q(a,\hat{x},\hat{v})\}$. The first iteration computes $\theta_1 = \{a/\hat{z},\ \hat{w}/\hat{x},\ \hat{u}/\hat{v}\}$ and
$\Theta_1 = (\epsilon \cdot \theta_1)/VARS(q) = \{a/\hat{z}\}$. This indicates that every ground query that is an instance of
$q\Theta_1 = P(\hat{w},a,\hat{u}) \land Q(\hat{w},a,\hat{u})$ could also have completed this first iteration. The second iteration
must find the instances of $Q(\hat{w},a,\hat{u})$ that succeed and in doing so computes $\theta_2 = \{a/\hat{w},\ a/\hat{x},\ \hat{u}/\hat{v}\}$
and $\Theta_2 = \{a/\hat{w},\ a/\hat{z}\}$. This indicates that every ground query that is an instance of
$q\Theta_2 = P(a,a,\hat{u}) \land Q(a,a,\hat{u})$ also could have completed the two iterations. Hence, as desired, $\Theta_2$
is a generalized answer to $kb \Rightarrow q$.
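
The two iterations of this example can be replayed mechanically. The Python sketch below follows the step B) branch of the algorithm for each conjunct, with the KB sentence for each iteration fixed by hand rather than chosen non-deterministically. It is a hedged illustration only: the term encoding ("?"-prefixed strings for schematic variables, tuples for atoms), the helper names, and the particular recursive unification routine are assumptions made for the example. No renaming is performed in the second iteration because the variables of the chosen KB sentence do not clash with those of the partially instantiated query.

```python
def is_var(t):
    return isinstance(t, str) and t.startswith("?")

def apply(theta, e):
    if is_var(e):
        return theta.get(e, e)
    if isinstance(e, tuple):
        return (e[0],) + tuple(apply(theta, a) for a in e[1:])
    return e

def compose(theta, sigma):
    """Substitution that applies theta first and sigma second."""
    out = {v: apply(sigma, t) for v, t in theta.items()}
    for v, t in sigma.items():
        out.setdefault(v, t)
    return {v: t for v, t in out.items() if t != v}

def restrict(theta, variables):
    return {v: t for v, t in theta.items() if v in variables}

def occurs(v, t):
    return t == v or (isinstance(t, tuple) and any(occurs(v, a) for a in t[1:]))

def unify(e1, e2, theta=None):
    """Most general unifier of e1 and e2 as a dict, or None."""
    theta = {} if theta is None else theta
    e1, e2 = apply(theta, e1), apply(theta, e2)
    if e1 == e2:
        return theta
    for v, t in ((e1, e2), (e2, e1)):
        if is_var(v):
            if occurs(v, t):
                return None
            new = {u: apply({v: t}, s) for u, s in theta.items()}
            new[v] = t
            return new
    if (isinstance(e1, tuple) and isinstance(e2, tuple)
            and e1[0] == e2[0] and len(e1) == len(e2)):
        for a, b in zip(e1[1:], e2[1:]):
            theta = unify(a, b, theta)
            if theta is None:
                return None
        return theta
    return None

q1, q2 = ("P", "?w", "?z", "?u"), ("Q", "?w", "?z", "?u")   # conjuncts of q
b1, b2 = ("P", "?x", "a", "?v"), ("Q", "a", "?x", "?v")     # KB sentences
q_vars = {"?w", "?z", "?u"}                                  # VARS(q)

Theta0 = {}                                                  # step (3')
theta1 = unify(b1, apply(Theta0, q1))                        # iteration 1
Theta1 = restrict(compose(Theta0, theta1), q_vars)
theta2 = unify(b2, apply(Theta1, q2))                        # iteration 2
Theta2 = restrict(compose(compose(Theta0, theta1), theta2), q_vars)
```

Under these assumptions the computed values agree with the substitutions discussed above: the first partial extracted answer binds only z, and the extracted answer binds both w and z to a while leaving u free.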

Let  us  now  observe  that  every  step  of  the  SSRA  is  effectively  computable.  To  see  this,  we 
need  only  consider  the  operations  that  are  part  of  the  SSRA  but  not  the  GSRA: 

(1)  deciding  whether  two  expressions  are  unifiable, 

(2)  finding  an  MGCU  of  two  unifiable  expressions, 

(3)  applying  a  substitution  to  an  expression, 

(4)  restricting  the  domain  of  a  substitution,  and 

(5)  composing  two  substitutions. 


The Unification Theorem tells us that (1) and (2) are finitely computed by the Unification
Algorithm. Since substitutions produced by that algorithm have finite domains, they can be
represented as sets such as $\{a/x, b/y\}$. Using this representation, (3), (4) and (5) can each be
computed straightforwardly, and furthermore, the substitutions resulting from (4) and (5) also
have finite domains.

We also can observe that the SSRA always terminates when input a finite sequent. As
discussed in the previous chapter, this claim is contingent on every non-deterministic choice being
made from a finite set. Except for the choices made in steps (9) and (15), this observation has
already been made in the course of examining the GSRA. Steps (9) and (15) pose no problem
because the Unification Theorem tells us that these choices are made from finite sets (indeed,
singleton sets).

It may seem odd that the algorithm is stated in such a way that a non-deterministic
choice is made from a singleton set. The algorithm turns its back on the Unification Theorem,
and so does the theoretical analysis that follows. Only in the above discussion of termination is
the size of MGCU's considered, and even there only their finiteness matters. By ignoring the
size of the MGCU's, both the algorithm and the analysis achieve greater generality, generality
that is exploited in Chapter 5. That chapter introduces a variety of unification for which
non-singleton MGCU's exist. Since the results of this chapter are independent of the size of the
MGCU's, they can be carried over intact and used in Chapter 5.

Steps  (8)  and  (14)  do  not  say  how  to  decide  unifiability  nor  do  steps  (9)  and  (15)  say  how 
the  MGCU’s  are  to  be  computed.  Though  the  Unification  Theorem  points  to  the  Unification 
Algorithm  as  one  method  that  could  be  used,  neither  the  algorithm  nor  the  analysis  insist  on 
this.  For  example,  the  proof  of  the  algorithm’s  correctness  is  not  contingent  on  whether  the 
MGCU  is  or  is  not  one  that  could  be  computed  by  the  Unification  Algorithm.  Though  1  do  not 
do  so,  this  app.oach  could  be  taken  one  step  further,  for  the  correctness  of  the  algorithm  in  no 
way  depends  on  whether  the  complete  sets  of  unifiers  are  most  general.  The  use  of  complete 
sets  that  are  not  most  general  would  only  introduce  redundancy  into  the  search  spaces  of  the 
algorithm.  In  the  extreme  case,  the  use  of  infinite  complete  sets  would  yield  infinite  search 
spaces  and  render  the  guarantee  of  termination  null  and  void. 
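
Steps (8), (9), (14) and (15) leave the unification procedure open. As a minimal sketch, assuming ordinary syntactic unification and a hypothetical term encoding (a variable is a string beginning with "?", a compound term or atom is a tuple), the following Python function returns a complete set of unifiers as a list that is either empty or a singleton, which is the situation the Unification Theorem describes. The names `unify` and `mgcu` are illustrative and are not taken from the statement of the algorithm.

```python
from typing import Dict, List, Optional, Tuple, Union

# Hypothetical encoding (an assumption of this sketch, not of the thesis):
# a variable is a str starting with "?", a constant is any other str, and a
# compound term or atom is a tuple (functor, arg1, ..., argn).
Term = Union[str, Tuple]
Subst = Dict[str, Term]


def is_var(t: Term) -> bool:
    return isinstance(t, str) and t.startswith("?")


def walk(t: Term, s: Subst) -> Term:
    """Follow bindings in s until reaching a non-variable or an unbound variable."""
    while is_var(t) and t in s:
        t = s[t]
    return t


def occurs(v: str, t: Term, s: Subst) -> bool:
    """Occurs check: does variable v appear in t under the bindings of s?"""
    t = walk(t, s)
    if t == v:
        return True
    return isinstance(t, tuple) and any(occurs(v, a, s) for a in t[1:])


def unify(t1: Term, t2: Term, s: Subst) -> Optional[Subst]:
    """Extend s (kept in triangular form) to unify t1 and t2, or return None."""
    t1, t2 = walk(t1, s), walk(t2, s)
    if t1 == t2:
        return s
    if is_var(t1):
        return None if occurs(t1, t2, s) else {**s, t1: t2}
    if is_var(t2):
        return None if occurs(t2, t1, s) else {**s, t2: t1}
    if (isinstance(t1, tuple) and isinstance(t2, tuple)
            and len(t1) == len(t2) and t1[0] == t2[0]):
        for a, b in zip(t1[1:], t2[1:]):
            s = unify(a, b, s)
            if s is None:
                return None
        return s
    return None


def mgcu(t1: Term, t2: Term) -> List[Subst]:
    """Complete set of unifiers as a list: empty, or a singleton."""
    s = unify(t1, t2, {})
    return [] if s is None else [s]
```

Returning a list rather than a single substitution mirrors the way the algorithm is stated: a caller that iterates over the set would be unaffected if a unification procedure yielding non-singleton sets were substituted.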

The  SSRA  implicitly  defines  a  search  space  in  the  same  way  that  the  GSRA  does.  The 
only differences are that the arcs of an SSRA search space are labeled with an additional
equation to show the value of $\theta_i$, and that SUCCESS nodes are labeled with the corresponding
extracted answer. By the argument above, the SSRA implicitly defines a finite search space for
every finite sequent. Figure 3.1 displays an example of an SSRA search space, that of
$P(a),\ R(z) \Rightarrow (P(x) \lor Q(x)) \land (R(y) \lor \lnot R(x))$. To expedite the display and the
discussion that follows, the labels on search space arcs are condensed and written as
four-tuples in one of two forms:

$(A,\ n_i,\ p_i,\ \theta_i)$, or

$(B,\ b_i,\ (F_i(l_{i,1}), \ldots, F_i(l_{i,m})),\ \theta_i)$.


Examination  of  the  SSRA  and  the  GSRA  shows  that  they  behave  identically  when  input  a 
ground  sequent.  This  is  stated  by  the  Ground  Equivalence  Theorem. 


Ground  Equivalence  Theorem 

Let $kb \Rightarrow q$ be a ground CNF sequent of QFPC. Then $kb \vdash_{SSRA} q$ iff $kb \vdash_{GSRA} q$. Moreover, the
Ground Sequent Retrieval Algorithm and the Schematic Sequent Retrieval Algorithm implicitly
define isomorphic search spaces. They differ only in that every arc in the schematic search
space has an additional label of the form "$\theta_i = \varepsilon$".


Now  that  we  see  how  the  GSRA  relates  to  the  SSRA  on  ground  sequents,  let  us  see  how 
the SSRA on ground sequents relates to the SSRA on schematic sequents. I have already
mentioned that computations with schematic sequents are themselves schematic for computations
on  ground  sequents.  This  notion  is  captured  by  the  definition  of  lifting. 


Definition:  Lifting 

Let D be a length s derivation from $kb \Rightarrow q$ with partial extracted answers $\Theta_0, \Theta_1, \ldots, \Theta_s$ and
let D' be a length s derivation from $kb' \Rightarrow q'$ with partial extracted answers $\Theta_0', \Theta_1', \ldots, \Theta_s'$.
Then D lifts D' provided that these conditions are met for $1 \leq i \leq s$:

•  The ith arc of D is labeled with step A iff the ith arc of D' is.

•  If the ith arc of D is labeled $(A, n_i, p_i, \theta_i)$ and the ith arc of D' is labeled
$(A, n_i', p_i', \theta_i')$ then $(n_i', p_i')$ is an instance of $(n_i, p_i)$.

•  If the ith arc of D is labeled $(B, b_i, (\varphi_1, \ldots, \varphi_m), \theta_i)$ and the ith arc of D' is labeled
$(B, b_i', (\varphi_1', \ldots, \varphi_m'), \theta_i')$ then $b_i'$ is an instance of $b_i$, and $(\varphi_1', \ldots, \varphi_m')$ is an
instance of $(\varphi_1, \ldots, \varphi_m)$.

•  $q'(\varepsilon \cdot \theta_1' \cdots \theta_i')$ is an instance of $q(\varepsilon \cdot \theta_1 \cdots \theta_i)$, i.e., $\Theta_i' \leq \Theta_i$.


For example, the proof displayed along the left path of Figure 3.1 lifts the (one and only)
proof of $P(a),\ R(a),\ R(b) \Rightarrow (P(a) \lor Q(a)) \land (R(a) \lor \lnot R(a))$ and the one along the right path lifts
the (one and only) proof of $P(a),\ R(a),\ R(b) \Rightarrow (P(a) \lor Q(a)) \land (R(b) \lor \lnot R(a))$. Figures 3.2 and
3.3 show these proofs.


(B, P(a), (P(x)), {a/x})          (B, P(a), (P(a)), $\varepsilon$)

(A, R(x), R(y), {a/y})            (A, R(a), R(a), $\varepsilon$)

SUCCEED with {a/x, a/y}           SUCCEED with $\varepsilon$

Left: Proof of $P(a),\ R(z) \Rightarrow (P(x) \lor Q(x)) \land (R(y) \lor \lnot R(x))$

Right: Proof of $P(a),\ R(a),\ R(b) \Rightarrow (P(a) \lor Q(a)) \land (R(a) \lor \lnot R(a))$

Figure 3.2: Proof at Left Lifts Proof at Right


(B, P(a), (P(x)), {a/x})          (B, P(a), (P(a)), $\varepsilon$)

(B, R(z), (R(y)), {y/z})          (B, R(b), (R(b)), $\varepsilon$)

SUCCEED with {a/x}                SUCCEED with $\varepsilon$

Left: Proof of $P(a),\ R(z) \Rightarrow (P(x) \lor Q(x)) \land (R(y) \lor \lnot R(x))$

Right: Proof of $P(a),\ R(a),\ R(b) \Rightarrow (P(a) \lor Q(a)) \land (R(b) \lor \lnot R(a))$

Figure 3.3: Proof at Left Lifts Proof at Right


Though  ground  proofs  such  as  those  shown  above  are  of  particular  interest,  they  are  not  the 
only  proofs  that  can  be  lifted.  For  instance,  Figure  3.4  displays  a  non-ground  proof  that  is 
lifted  by  the  right  path  of  Figure  3.1.  Note  that  the  proof  in  Figure  3.4  lifts  the  right  proof  of 
Figure  3.3. 


(B, P(a), (P(x)), {a/x})

(B, R(b), (R(b)), $\varepsilon$)

SUCCEED with {a/x}

Proof of $P(a),\ R(a),\ R(b) \Rightarrow (P(x) \lor Q(x)) \land (R(b) \lor \lnot R(x))$

Figure 3.4: A Non-Ground Proof that is Lifted


As  a  result  of  the  lifting  relationship  between  proofs,  the  Lifting  Theorem  is  obtained. 


Lifting  Theorem 

Let $kb \Rightarrow q$ be a schematic CNF sequent of QFPC and $\sigma$ be a ground substitution for SVARS(q).
Then $kb_{gr} \vdash_{SSRA} q\sigma$ iff for some $\gamma \geq \sigma$, $kb \vdash_{SSRA} q$ with extracted answer $\gamma$.

Proof 

I refer to a proof of $kb_{gr} \Rightarrow q\sigma$ as P' and a proof of $kb \Rightarrow q$ as P. Furthermore, I refer to the
values produced by P' by superscripting them with ' in order to distinguish them from the
values produced by P, which are not superscripted. For instance, $q_0', \ldots, q_s'$ are produced by
P' while $q_0, \ldots, q_s$ are produced by P. Also, throughout this proof let V = VARS(q). First I
make the general observation that for $0 < i \leq s$, $\Theta_i = (\varepsilon \cdot \theta_1 \cdots \theta_i)/V = (\Theta_{i-1} \cdot \theta_i)/V$.

if clause: Let Prop2(j) be the proposition:

If proof P traverses the loop j times producing a partial extracted answer $\Theta_j$ that
is more general than $\sigma$, then there is a proof P' that traverses the loop j times.

Prop2(s), which is the if-clause of this theorem, is proved by induction. Prop2(0) trivially
holds; $\Theta_0 = \varepsilon$, which is more general than any $\sigma$. Assuming Prop2(i-1) I now prove Prop2(i),
for any $1 \leq i \leq s$. By the induction hypothesis, $\Theta_{i-1} \geq \sigma$. On the ith iteration of the loop, P
executes either step (A) or step (B). In each case I show that P' can successfully execute the same
step.

Step (A): If P successfully executed step (A) then $\theta_i$ unifies $n_i\Theta_{i-1}$ and $p_i\Theta_{i-1}$. Hence $\Theta_{i-1} \cdot \theta_i$
unifies $n_i$ and $p_i$. Because $DOM(\Theta_{i-1} \cdot \theta_i) \subseteq V$, $\Theta_{i-1} \cdot \theta_i = (\Theta_{i-1} \cdot \theta_i)/V$, which in turn equals $\Theta_i$.
So, $\Theta_i$ unifies $n_i$ and $p_i$. Now P' can also choose step (A) and can choose $n_i' = n_i\sigma$ and $p_i' = p_i\sigma$.
Since $\Theta_i$ is more general than $\sigma$ and unifies $p_i$ and $n_i$, $\sigma$ also unifies them and hence $p_i' = n_i'$.
Therefore P' can successfully complete the ith iteration.

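Step (B): If P chooses step (B), $b_i$ and $F_i$ then $(l_{i,1}, \ldots, l_{i,m})$ and $(F_i(l_{i,1}), \ldots, F_i(l_{i,m}))\Theta_{i-1}$
are unifiable by $\theta_i$. Since $\Theta_i \geq \sigma$, let $\rho$ be such that $\Theta_i \cdot \rho = \sigma$. Now, P' can also choose step (B)
and can choose $b_i'$ and $F_i'$ so that $b_i' = b_i\theta_i\rho$ (we will see shortly that this choice is indeed a
ground formula) and $F_i'(l_{i,j}') = F_i(l_{i,j})\sigma$, for $1 \leq j \leq m$. However, for $1 \leq j \leq m$, $F_i(l_{i,j})\sigma$ equals
$F_i(l_{i,j})\Theta_i\rho$, which in turn equals $F_i(l_{i,j})\Theta_{i-1}\theta_i\rho$ because $VARS(F_i(l_{i,j})) \subseteq V$. Hence, for
$1 \leq j \leq m$, $l_{i,j}\theta_i\rho$ equals $F_i'(l_{i,j}')$, and so $l_{i,j}' = F_i'(l_{i,j}')$. Therefore, P' can successfully execute
step (B).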

only-if clause: Let Prop1(j) be the proposition:

If proof P' traverses the loop j times then there is a proof P that traverses the loop
j times producing a partial extracted answer $\Theta_j$ that is more general than $\sigma$.

I prove Prop1(s), which is the only-if clause of this theorem, by induction. Prop1(0) trivially
holds for $\Theta_0 = \varepsilon$. Assuming Prop1(i-1) I now prove Prop1(i). On the ith iteration of the loop P'
executes either step (A) or step (B). Consider each possibility:

Step (A): If P' chooses step (A) and $n_i'$ and $p_i'$ then $n_i' = p_i'$. P can then choose to do step (A)
and can choose $n_i$ and $p_i$ so that $n_i\sigma = n_i'$ and $p_i\sigma = p_i'$. Since $\Theta_{i-1} \geq \sigma$ (by the inductive
hypothesis), there exists a substitution $\lambda$ such that

$\Theta_{i-1} \cdot \lambda = \sigma$.  (3.12)

Also because $\Theta_{i-1} \geq \sigma$, $n_i\Theta_{i-1}$ and $p_i\Theta_{i-1}$ are unifiable. Moreover, one element of any MGCU of
$n_i\Theta_{i-1}$ and $p_i\Theta_{i-1}$ is more general than $\lambda$. Let P choose it as $\theta_i$. Now, from (3.12) it follows
that $\Theta_{i-1} \cdot \theta_i \geq \sigma$ and so $\Theta_i \geq \sigma$. Therefore P can execute step (A) generating a partial extracted
answer $\Theta_i \geq \sigma$.

Step (B): If P' chooses step (B) and $b_i'$ and $F_i'$ then $l_{i,j}' = F_i'(l_{i,j}')$, for $1 \leq j \leq m$. Then P can
choose step (B) and $b_i$ and $F_i$ so that $F_i(l_{i,j})\sigma = F_i'(l_{i,j}')$, for $1 \leq j \leq m$, and $b_i\rho = b_i'$ for some
substitution $\rho$ whose domain is $VARS(b_i)$. Then $l_{i,j}\rho = F_i(l_{i,j})\sigma$, for $1 \leq j \leq m$. Because $\Theta_{i-1} \geq \sigma$
(by the inductive hypothesis), there is a substitution $\lambda$ such that

$\Theta_{i-1} \cdot \lambda = \sigma$  (3.13)

Also because $\Theta_{i-1} \geq \sigma$, $l_{i,j}\rho$ and $F_i(l_{i,j})\Theta_{i-1}$ are unifiable. Because
$DOM(\rho) \cap VARS(F_i(l_{i,j})\Theta_{i-1}) = \emptyset$, $l_{i,j}$ and $F_i(l_{i,j})\Theta_{i-1}$ are also unifiable. Moreover, one
element from any one of their MGCU’s is more general than $\lambda$. Let P choose it as $\theta_i$. Now, from
(3.13) it follows that $\Theta_{i-1} \cdot \theta_i \geq \sigma$ and so $\Theta_i \geq \sigma$. Therefore, P can execute step (B) producing a
partial extracted answer $\Theta_i$ that is more general than $\sigma$.  ■

SSRA  Correctness  Theorem 

Let $kb \Rightarrow q$ be a schematic CNF sequent of QFPC and $\Gamma = \{\gamma \mid kb \vdash_{SSRA} q$ with extracted answer
$\gamma\}$. Then $\Gamma$ is a complete set of general answers to $kb \Rightarrow q$.

Proof 

$\theta$ is an answer to $kb \Rightarrow q$

iff $kb_{gr} \models_{RP} q\theta$  (Def. of Schematic Retrievability)

iff $kb_{gr} \vdash_{GSRA} q\theta$  (GSRA Correctness Theorem)

iff $kb_{gr} \vdash_{SSRA} q\theta$  (Ground Equivalence Theorem)

iff $kb \vdash_{SSRA} q$ with some extracted answer $\gamma \geq \theta$  (Lifting Theorem)

Therefore, if $\gamma$ is an extracted answer then every ground substitution $\theta \leq \gamma$ for SVARS(q) is an
answer. So $\gamma$ is a generalized answer. Going the other way, if $\theta$ is an answer then some $\gamma \geq \theta$ is
an extracted answer. Hence $\Gamma$ is a complete set of generalized answers. This proof is
encapsulated in Figure 3.5.  ■

As a consequence of this theorem we can now see the validity of the previously
unsubstantiated claim that every finite schematic sequent has a finite complete set of general
answers. Such a set is, of course, $\Gamma$ of the Correctness Theorem. That $\Gamma$ is finite if $kb \Rightarrow q$ is
finite is obvious when one recalls that every finite sequent has a finite search space.

The Correctness Theorem does not claim that $\Gamma$ is a most general complete set, and
indeed, in general, it is not most general. Obviously, $\Gamma$ would not be most general if in steps
(9) and (15) the SSRA used complete sets of unifiers that were not most general. Less
obviously, this can happen even though MGCU’s are used. Figure 3.1 shows that one extracted
answer to $P(a),\ R(z) \Rightarrow (P(x) \lor Q(x)) \land (R(y) \lor \lnot R(x))$, {a/x}, is more general than the other,
{a/x, a/y}. In some sense this situation is purely coincidental; it is not caused by some
defect in the algorithm such as a failure to schematize to the most-general level. The second
arcs of the two proofs are unrelated, being computed by different methods; they yield two
answers such that one just happens to be more general than the other.

Before  closing  this  section,  we  return  to  the  issue  of  selection  functions.  Like  the  GSRA, 
the  correctness  of  the  SSRA  does  not  depend  on  the  order  in  which  the  conjuncts  of  a  query 
are  worked  on.  This  is  true  of  the  SSRA  for  the  same  reason  that  it  is  true  of  the  GSRA: 
since  the  SSRA  is  correct  for  any  arbitrary  CNF  query,  it  is  correct  for  every  query  obtained 
by  permuting  the  conjuncts  of  that  arbitrary  query. 

However, unlike the GSRA, the SSRA does not have the stronger property of
decomposability. The conjuncts of the query $P(x) \land Q(x)$ cannot be retrieved independently of each
other. In retrieving P(x) on the first iteration, the substitution $\theta_1$ is computed, and the
second iteration has the task of retrieving $Q(x)\theta_1$. Whereas a ground query has the property
of being retrievable if each of its conjuncts is, a non-ground query does not have this
property. In particular, conjuncts sharing variables are not decomposable. For example, both
P(x) and Q(x) are retrievable from $P(a) \land Q(b)$ but $P(x) \land Q(x)$ is not.
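
The failure of decomposability can be seen in a small experiment. The following is a minimal sketch, assuming a KB that consists only of ground atomic facts and a query that is a conjunction of atoms; it is not the SSRA, which handles arbitrary CNF sequents. Atoms are encoded as tuples and variables as strings beginning with "?"; the function names `match` and `answers` are hypothetical.

```python
def match(pattern, fact, s):
    """Extend substitution s so that pattern equals the ground fact, else None."""
    if pattern[0] != fact[0] or len(pattern) != len(fact):
        return None
    s = dict(s)  # copy so that failed attempts leave the caller's s untouched
    for p, f in zip(pattern[1:], fact[1:]):
        if p.startswith("?"):
            # A variable must keep any value it was given by an earlier conjunct.
            if s.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return s


def answers(conjuncts, kb, s=None):
    """All substitutions that retrieve every conjunct, threading s left to right."""
    s = {} if s is None else s
    if not conjuncts:
        return [s]
    found = []
    for fact in kb:
        s2 = match(conjuncts[0], fact, s)
        if s2 is not None:
            found.extend(answers(conjuncts[1:], kb, s2))
    return found


kb = [("P", "a"), ("Q", "b")]
print(answers([("P", "?x")], kb))               # P(x) alone is retrievable
print(answers([("Q", "?x")], kb))               # Q(x) alone is retrievable
print(answers([("P", "?x"), ("Q", "?x")], kb))  # the conjunction is not
```

The binding computed for the first conjunct is passed to the second, so the shared variable x cannot take the value a in one conjunct and b in the other.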

Though  the  choice  of  selection  function  does  not  affect  the  correctness  of  the  SSRA,  it 
can  have  a  drastic  effect  on  efficiency.  This  is  illustrated  by  an  example  drawn  from  Chat-80 
(Warren,  1981;  Warren  and  Pereira,  1982),  a  computer  program  that  accesses  a  simple  KB  in 
order  to  respond  to  English  queries  about  world  geography.  Chat-80’s  KB  contains  atomic 
sentences  describing  the  sort  of  each  geographical  entity  it  knows  about,  for  instance 

Country(US), Country(Canada), Country(Mexico), Country(Iceland), ...

Ocean(Atlantic), Ocean(Pacific), Ocean(Indian), ...

It also contains an atomic sentence for every border shared by two of these entities:

Borders(US, Canada), Borders(US, Atlantic), Borders(Iceland, Atlantic), ...

To find all countries bordering the U.S. either query (3.14) or (3.15) could be issued.

$Country(x) \land Borders(US, x)$  (3.14)

$Borders(US, x) \land Country(x)$  (3.15)

Either query does the job, but the second does it more efficiently. Since there are only 5
entities bordering the U.S. but approximately 150 countries, it is simpler to generate the 5
bordering entities and check whether they are countries than to generate the 150 countries and check
whether they border the U.S. The search spaces for these two queries are displayed below in
Figures 3.6 and 3.7. The arcs of these search spaces are labeled only with the value of $\theta_i$
instead of an entire 4-tuple.

I  use  Chat-80  to  illustrate  the  importance  of  the  selection  function  because  the  program 
contains  a  query-planning  mechanism  that  can  automatically  replace  a  query  like  (3.14)  with 
a query like (3.15). To do this, the query planner estimates the number of answers that there
are to each conjunct in the query and places the conjunct with the fewest estimated answers
first. Then this is repeated for the
remaining conjuncts, taking into account that variables may become instantiated as a result
of  the  substitutions  computed  by  answering  the  previous  conjuncts.  For  example,  on  the  basis 
of  its  estimates  that  Country (x)  has  150  answers  and  Borders(US  ,x)  has  only  5,  the  query 
planner  chooses  (3.15)  over  (3.14). 
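
The effect of conjunct ordering, and a greedy reordering in the spirit of the planner just described, can be sketched as follows. This is an illustration under stated assumptions, not Chat-80's actual mechanism: the KB holds only the sample facts listed above, cost is measured as the number of match attempts, and the `estimate` heuristic (count the matching facts, or 1 if every argument is already bound) is hypothetical.

```python
def match(pattern, fact, s):
    """Extend substitution s so that pattern equals the ground fact, else None."""
    if pattern[0] != fact[0] or len(pattern) != len(fact):
        return None
    s = dict(s)
    for p, f in zip(pattern[1:], fact[1:]):
        if p.startswith("?"):
            if s.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return s


def run(conjuncts, kb):
    """Answer a conjunctive query left to right; return (answers, match attempts)."""
    attempts = 0

    def go(cs, s):
        nonlocal attempts
        if not cs:
            return [s]
        found = []
        for fact in kb:
            attempts += 1
            s2 = match(cs[0], fact, s)
            if s2 is not None:
                found.extend(go(cs[1:], s2))
        return found

    return go(list(conjuncts), {}), attempts


def estimate(conjunct, kb, bound):
    """Hypothetical estimate of how many answers a conjunct has."""
    if all(not a.startswith("?") or a in bound for a in conjunct[1:]):
        return 1  # fully instantiated: it only needs to be checked
    return sum(match(conjunct, f, {}) is not None for f in kb)


def plan(conjuncts, kb):
    """Greedily place the conjunct with the smallest estimate first."""
    remaining, ordered, bound = list(conjuncts), [], set()
    while remaining:
        best = min(remaining, key=lambda c: estimate(c, kb, bound))
        remaining.remove(best)
        ordered.append(best)
        bound |= {a for a in best[1:] if a.startswith("?")}
    return ordered


KB = [("Country", "US"), ("Country", "Canada"), ("Country", "Mexico"),
      ("Country", "Iceland"), ("Ocean", "Atlantic"), ("Ocean", "Pacific"),
      ("Ocean", "Indian"), ("Borders", "US", "Canada"),
      ("Borders", "US", "Atlantic"), ("Borders", "Iceland", "Atlantic")]
Q314 = [("Country", "?x"), ("Borders", "US", "?x")]
Q315 = [("Borders", "US", "?x"), ("Country", "?x")]

print(run(Q314, KB))
print(run(Q315, KB))
print(plan(Q314, KB))
```

Both orderings produce the same answers, but the ordering that starts with the more selective conjunct makes fewer match attempts, and `plan` applied to (3.14) yields the ordering of (3.15).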

Chat-80’s  query  planner  has  other  capabilities,  of  which  I  illustrate  only  one  more. 
When  confronted  with  the  query 

$Country(x) \land Borders(US, x) \land Country(y) \land Borders(France, y)$

(which asks for all countries bordering the U.S. and all countries bordering France) Chat-80
can decompose it into two independent queries:

$Country(x) \land Borders(US, x)$

$Country(y) \land Borders(France, y)$

Furthermore,  it  realizes  that  because  of  the  shared  variables,  neither  of  these  two  queries  can 
be  decomposed  further. 
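
One way to find such independent subqueries is to group conjuncts that are linked by shared variables. The following is a minimal sketch of that idea, assuming the same tuple encoding of atoms with variables written as strings beginning with "?"; the function name `split_independent` is hypothetical and nothing here is claimed to be how Chat-80 implements the decomposition.

```python
def split_independent(conjuncts):
    """Partition conjuncts into groups such that no variable is shared across groups."""
    groups = []  # list of (variables of the group, conjuncts of the group)
    for c in conjuncts:
        c_vars = {a for a in c[1:] if a.startswith("?")}
        merged_vars, merged, untouched = set(c_vars), [], []
        for g_vars, g_conjuncts in groups:
            if g_vars & c_vars:
                # c shares a variable with this group, so they must stay together.
                merged_vars |= g_vars
                merged.extend(g_conjuncts)
            else:
                untouched.append((g_vars, g_conjuncts))
        merged.append(c)
        groups = untouched + [(merged_vars, merged)]
    return [g_conjuncts for _, g_conjuncts in groups]


query = [("Country", "?x"), ("Borders", "US", "?x"),
         ("Country", "?y"), ("Borders", "France", "?y")]
for subquery in split_independent(query):
    print(subquery)
```

Each resulting group can be answered on its own, while the conjuncts inside a group share a variable and so cannot be separated further.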

Chat-80’s  query  planner  provides  an  excellent  example  of  some  techniques  that  can  be 
used  to  build  smaller  search  spaces.  I  say  nothing  more  on  this  issue  other  than  to  suggest 
that  similar  techniques  could  be  used  to  reduce  the  search  required  to  answer  the  wh-queries 
of this chapter.

Figure 3.1: Search Space of $P(a),\ R(z) \Rightarrow (P(x) \lor Q(x)) \land (R(y) \lor \lnot R(x))$


[Diagram relating the statements "$kb \vdash_{SSRA} q$ with extracted answer $\gamma \geq \theta$" and "$\theta$ is an answer"
through arcs labeled SSRA Correctness Theorem, Lifting Theorem, Definition of Retrievability
of Schematic Sequents, Ground Equivalence Theorem, and GSRA Correctness Theorem.]

Figure 3.5: Proof of the SSRA Correctness Theorem


Chapter  4 


A  Retriever  for  a  Quantificational  Language 

This chapter extends the retriever of the previous chapter to deal with the entire
first-order predicate calculus, quantifiers and all. The first section of this chapter shows that using
RP  to  interpret  quantifiers  in  the  usual  fashion  yields  an  undecidable  logic.  Analysis  of  RP 
reveals  how  it  allows  chaining  to  slip  in  subtly,  leading  to  undecidability.  On  the  basis  of  this 
analysis,  the  second  section  develops  a  new  logic,  RQ,  which  is  designed  to  agree  with  RP  on 
retrievability in the absence of quantifiers, yet be decidable in their presence. Section 3
examines the properties of RQ, showing that it agrees with RP on propositional retrievability and
that  it  has  the  important  Strong  Herbrand  Property,  which  this  chapter  introduces.  On  the 
basis  of  these  two  properties,  the  fourth  and  final  section  demonstrates  how  the  Schematic 
Sequent  Retrieval  Algorithm  can  be  used  to  decide  RQ-entailment. 

The  retriever  that  this  chapter  specifies  operates  on  a  language  called  the  First-Order 
Predicate  Calculus  (FOPC),  which  is  identical  to  QFPC  except  for  the  addition  of  quantifiers 
according  to  the  following  grammatical  rule: 

if $\psi$ is a formula and x is a variable then $\exists x\,\psi$ and $\forall x\,\psi$ are formulas.

As  is  usual,  FOPC  sentences  are  closed  FOPC  formulas. 

The logic developed here, RQ, is defined only for sentences in prenex form. Hence, from
this point on, the word "sentence" only refers to prenex form sentences. A prenex form sentence
contains no quantifiers in the constituent subformulas of any of its logical connectives. In terms
of the surface syntax, quantifiers in a prenex form sentence appear at the left end of the
sentence. Thus, $\forall x \exists y\,(P(x) \rightarrow R(x,y))$ is in prenex form whereas $\forall x\,(P(x) \rightarrow \exists y\,R(x,y))$ is not. A
prenex form sentence is in universal prenex form iff it contains no existential quantifiers;
likewise, it is in existential prenex form iff it contains no universal quantifiers.

A prenex form sentence divides neatly into two parts: a prefix containing all the
quantifiers and a matrix containing the entire quantifier-free formula that appears to the right
of the quantifiers. Hence, $\forall x \exists y\,(P(x) \rightarrow R(x,y))$ is composed of a prefix of $\forall x \exists y$ and a matrix
of $(P(x) \rightarrow R(x,y))$. By convention, when I speak of instances of a prenex sentence I am actually
referring to instances of its matrix.

4.1.  An  Inadequate  Treatment  of  Quantification 

This section investigates the outcome of giving quantifiers their standard interpretation in
RP-models. By the "standard interpretation" for quantifiers I mean that a model M with
domain D assigns a value to a quantified formula according to the following semantic
equations, where e[d/x] is a function identical to e with the possible exception that
e[d/x](x) = d:

True $\in [\![\forall x\,\alpha]\!]^{M,e}$ iff for all $d \in D$, True $\in [\![\alpha]\!]^{M,e[d/x]}$

False $\in [\![\forall x\,\alpha]\!]^{M,e}$ iff for some $d \in D$, False $\in [\![\alpha]\!]^{M,e[d/x]}$

(4.1)

True $\in [\![\exists x\,\alpha]\!]^{M,e}$ iff for some $d \in D$, True $\in [\![\alpha]\!]^{M,e[d/x]}$

False $\in [\![\exists x\,\alpha]\!]^{M,e}$ iff for all $d \in D$, False $\in [\![\alpha]\!]^{M,e[d/x]}$

(4.2)

Since T uses the standard interpretation for quantifiers, every T-valuation is an
RP-valuation. Thus, for FOPC as well as QFPC, $\models_{RP}$ is weaker than $\models_T$.

We  can  continue  to  view  sentences  as  defining  intensions,  functions  from  A3S  to  A3,  by 
considering  a  quantifier  as  a  logical  connective  with  an  infinite  number  of  arguments.  It  is  easy 
to  see  that  the  intension  of  every  sentence  is  monotonic,  even  in  the  presence  of  quantifiers. 
Hence,  as  before,  the  FOPC  sentences  valid  in  RP  are  precisely  those  valid  in  T. 

This  spells  disaster  because  T-validity  is  only  semi-decidable.  Thus,  it  is  impossible  to 
determine  whether  an  arbitrary  FOPC  sentence  is  retrievable  from  a  KB  even  when  the  KB  is 
empty. Furthermore, as the following theorem states, the entire $\models_T$ relation can be mapped
into $\models_{RP}$.

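T to RP Mapping Theorem¹

Let $kb \Rightarrow q$ be a finite sequent of FOPC, let $P_1, \ldots, P_n$ be the predicates occurring in kb, and
let $\Psi_{kb}$ be the sentence

$(\exists\, P_1(x_1,\ldots,x_{m_1}) \land \lnot P_1(x_1,\ldots,x_{m_1})) \lor \cdots \lor (\exists\, P_n(x_1,\ldots,x_{m_n}) \land \lnot P_n(x_1,\ldots,x_{m_n}))$

Then $kb \models_T q$ iff $kb \models_{RP} q \lor \Psi_{kb}$.²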

Proof 

if clause: Assume $kb \models_{RP} q \lor \Psi_{kb}$. Then $kb \models_T q \lor \Psi_{kb}$, and because $\Psi_{kb}$ is T-unsatisfiable,
$kb \models_T q$.

only-if clause: Assuming $kb \models_T q$, I show that no RP-model is a countermodel to $kb \models_{RP} q \lor \Psi_{kb}$.
An RP-model that does not satisfy kb is not a countermodel. If an RP-model that does satisfy
kb is Tarskian then it satisfies q (by the assumption); otherwise it satisfies $\Psi_{kb}$ (since it assigns
{True, False} to some atomic formula). Therefore no RP-model is a countermodel to
$kb \models_{RP} q \lor \Psi_{kb}$.  ■

¹ This theorem and its proof were inspired by Patel-Schneider’s (1985) closely-related First-Order Entailment
Undecidability Theorem.

² That RP-validity and T-validity are identical is the special case of this theorem when $kb = \emptyset$.


This  result  is  surprising  considering  that  RP  does  not  sanction  modus  ponens  and  that  all 
sentences  are  RP-satisfiable.  What  went  wrong?  Why  hasn’t  the  no-chaining  restriction  led  to 
decidability  in  this  quantificational  logic?  These  questions  can  be  answered  by  observing  how 
this  undecidability  has  arisen  out  of  some  fundamental  properties  of  RP.  The  resulting  insights 
are  used  to  motivate  the  development  of  RQ. 

To  proceed  with  the  analysis  some  of  our  tools  need  to  be  generalized.  To  begin  with,  the 
notion  of  entailment  needs  to  be  generalized  to  allow  one  set  of  sentences,  A ,  to  entail  another 
set, B. As before, $A \models B$ means that no model satisfies A and falsifies B, and as before, a
model  satisfies  a  set  of  sentences  iff  it  satisfies  each  member.  Additionally,  we  now  say  that  a 
model  falsifies  a  set  of  sentences  iff  it  falsifies  each  member. 

Since  entailment  is  now  a  relationship  between  sets  of  sentences,  it  is  worth  generalizing 
the  definition  of  "sequent"  to  allow  a  set  of  sentences  to  appear  in  its  consequent.  These  new 
sequents are called generalized sequents, while mundane sequents with single-sentence
consequents are called ordinary sequents. As is consistent with the definition of entailment, I draw
no distinction between a sequent whose consequent is a single sentence and one whose
consequent is a singleton set. Hence, every ordinary sequent is also a generalized sequent. As with
antecedents, set signs in a consequent are often omitted. Thus I often write "$P, Q \Rightarrow P, Q$"
instead of "$\{P, Q\} \Rightarrow \{P, Q\}$", and "$P, Q \models P, Q$" instead of "$\{P, Q\} \models \{P, Q\}$".

Retrieval  problems  are  always  characterized  by  ordinary  sequents.  Non-ordinary  sequents 
arise  only  in  analyzing  retrieval.  The  primacy  of  ordinary  sequents  over  non-ordinary  sequents 
is  reflected  in  the  choice  of  adopting  the  convention  that  only  ordinary  sequents  are  considered 
unless  it  is  explicitly  stated  otherwise. 

Though  non-ordinary  sequents  do  not  arise  directly  in  characterizing  retrieval  problems, 
they  do  arise  in  relating  RP-entailment  for  FOPC  sequents  to  that  for  QFPC  sequents.  This 
relation  is  established  by  the  famous  Herbrand  Theorem.  Though  originally  formulated  for  T, 
it  holds  equally  well  for  many  other  logics,  including  RP.  A  logic  for  which  the  Herbrand 
Theorem  holds  is  said  to  have  the  Herbrand  Property.  The  Herbrand  Theorem  concerns 
sequents  in  a  normal  form  known  as  Skolem  Normal  Form  (or  SNF). 

Definition:  Skolem  Normal  Form 

A generalized sequent of FOPC is in Skolem Normal Form (SNF) iff all sentences in its
antecedent are in universal prenex form and all sentences in its consequent are in existential
prenex form.


Herbrand Theorem³

Let $kb \Rightarrow Q$ be a generalized SNF sequent of FOPC. Then, kb entails Q iff $kb_{gr}$ entails $Q_{gr}$.

For T, both the Herbrand Theorem and its proof are widely published.⁴ The proof, which
is  fairly  general,  is  a  product  of  the  Herbrand  Model  Lemma  and  the  use  of  the  standard 
interpretation  for  quantifiers.  Thus  the  proof  for  T  can  be  used  directly  to  prove  the  theorem 
for  RP. 

The Herbrand Theorem for RP relates RP-entailment for SNF sequents of FOPC to
RP-entailment for sequents of QFPC. As such, the theorem provides a way of eliminating universal
quantifiers from an antecedent and existentials from a consequent when considering questions of
RP-entailment for SNF sequents. As an example of this theorem in action consider a language
whose lexicon contains only the zero-place function symbols "a" and "b." The Herbrand
Theorem reduces the question of whether (4.3) holds to the question of whether (4.4) holds.

$\forall x\,P(x) \models_{RP} P(a) \lor P(b)$  (4.3)

$P(a),\ P(b) \models_{RP} P(a) \lor P(b)$  (4.4)

The  results  of  Chapter  2  bear  on  this  latter  implication.  According  to  the  RP-Decision 
Theorem  for  Facts,  (4.4)  holds,  and,  according  to  the  GSRA  Correctness  Theorem,  the  Ground 
Sequent  Retrieval  Algorithm  could  be  used  to  determine  this. 
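
The step from (4.3) to (4.4) replaces a universally quantified sentence by the ground instances of its matrix. The following is a minimal sketch of that grounding step, assuming, as in the example, a lexicon whose only function symbols are constants, so that the set of ground terms is finite; with function symbols of higher arity the set of instances would be infinite and this enumeration would not terminate. The tuple encoding of atoms and the name `ground_instances` are hypothetical.

```python
from itertools import product


def ground_instances(matrix, constants):
    """All ground instances of a flat atom whose variables start with '?'."""
    variables = []
    for arg in matrix[1:]:
        if arg.startswith("?") and arg not in variables:
            variables.append(arg)
    instances = []
    # Try every way of assigning a constant to each distinct variable.
    for values in product(constants, repeat=len(variables)):
        binding = dict(zip(variables, values))
        instances.append((matrix[0],) + tuple(binding.get(a, a) for a in matrix[1:]))
    return instances


# The matrix of the antecedent of (4.3), grounded over the constants a and b.
print(ground_instances(("P", "?x"), ["a", "b"]))
```

For the antecedent of (4.3) this produces the two facts that appear in the antecedent of (4.4).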

In stark contrast to universal quantifiers in an antecedent, the occurrence of existentials in
a consequent is extremely problematic. Observe that the Herbrand Theorem reduces the
question of whether (4.5) holds to the question of whether (4.6) holds.

$P(a) \lor P(b) \models_{RP} \exists x\,P(x)$  (4.5)

$P(a) \lor P(b) \models_{RP} P(a),\ P(b)$  (4.6)

However,  since  (4.6)  involves  a  generalized  sequent,  the  results  of  Chapter  2,  which  only  concern 
ordinary  sequents,  cannot  be  brought  to  bear.  The  GSRA  can’t  be  used  to  decide  whether  this 
implication  holds.  The  No-Chaining  Theorem  doesn’t  apply. 

Nonetheless, (4.6) does indeed hold. Any model that satisfies $P(a) \lor P(b)$ must satisfy
either P(a) or P(b). In either case, the model does not falsify {P(a), P(b)}. This example
illustrates that RP sanctions chaining among the elements of the consequent of a generalized
sequent. Consequently, to respond to an existentially quantified query, a retriever specified by
RP may have to chain together multiple instances of the query. Moreover, there is no bound on
the number of instances that may need to be chained and therefore RP-validity is not
decidable.

³ The usual statement of the Herbrand Theorem contains a claim about compactness. That aspect of the
theorem is omitted from the present statement since it is not used in this work.

⁴ Once again Loveland (1978) and Robinson (1979) provide good expositions.

This  example  of  chaining  within  a  consequent  is  rendered  benign  by  its  simplicity,  so  let  us 
consider  a  more-complex  example  where  many  instances  of  a  query  must  be  taken  into  account. 
Simply  observe  that  the  query 

$\exists x\ \lnot I(0) \lor [I(x) \land \lnot I(s(x))] \lor I(s(s(s(s(0)))))$  (4.7)

is RP-valid. This may become more apparent by thinking of the interpretation in which "I"
denotes a predicate that is true of precisely the integers, "s" denotes the successor function and
"0" denotes zero. To form a valid (i.e., non-falsifiable) set, at least four ground instances of
(4.7) are required: those generated by the substitutions {0/x}, {s(0)/x}, {s(s(0))/x}, and
{s(s(s(0)))/x}. It is only by taking these four instances together that a valid set can be formed
and  it  is  because  of  this  that  chaining  enters. 
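
The claim about the four instances can be checked by brute force. The following is a minimal sketch under a simplifying assumption: it uses ordinary two-valued (Tarskian) assignments to the ground atoms I(0), I(s(0)), ... rather than RP-valuations, relying on the earlier observation that RP-validity and T-validity coincide. The instance obtained by substituting the k-fold successor of 0 for x is identified by the integer k; the function names are hypothetical.

```python
from itertools import combinations, product


def instance_value(k, I):
    """Truth value of the instance of (4.7) with x replaced by the k-fold successor of 0.

    I[n] is the truth value assigned to the atom I(s^n(0)).
    """
    return (not I[0]) or (I[k] and not I[k + 1]) or I[4]


def falsifiable(ks):
    """True iff some two-valued assignment makes every listed instance false."""
    n_atoms = max(max(ks) + 2, 5)  # enough atoms for I(s^4(0)) and I(s^(k+1)(0))
    return any(
        all(not instance_value(k, I) for k in ks)
        for I in product([False, True], repeat=n_atoms)
    )


print(falsifiable([0, 1, 2, 3]))  # the four instances taken together
for three in combinations(range(4), 3):
    print(three, falsifiable(list(three)))  # any three of them
```

Falsifying every instance requires I(0) to be true, I(s(s(s(s(0))))) to be false, and each I(x) to imply I(s(x)) for the chosen values of x; with all four instances these requirements form a chain from I(0) to I(s(s(s(s(0))))) and cannot be met together.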


4.2.  RQ:  The  Logic  of  a  Retriever  for  a  Quantificational  Language 

The  RQ  model  theory  for  FOPC  is  developed  with  the  goal  of  endowing  the  retrievability 
relation  specified  by  RQ-entailment  with  two  properties.  Firstly,  over  the  sentences  of  QFPC 
the  retrievability  relations  specified  by  RQ  and  by  RP  should  be  identical.  In  this  sense  the 
retriever  specified  in  this  section  should  be  an  extension  of  the  retriever  specified  in  Chapter  2. 
Secondly, the retrievability relation specified by RQ should comply with the no-chaining
restriction in all respects, including its treatment of the quantifiers.

The  previous  section  points  out  that  it  is  not  sufficient  merely  to  prohibit  chaining  among 
the  facts  of  the  KB,  but  that  it  is  also  necessary  to  prohibit  it  among  instances  of  the  query.  In 
Chapter  2  the  development  of  a  logic  that  prohibits  chaining  among  the  facts  of  the  KB  was 
driven  by  an  attempt  to  define  a  logic  in  which  (4.8)  does  not  hold. 

$P,\ \lnot P \lor Q \models Q$  (4.8)

In the same vein, this section’s development of a logic that prohibits chaining among the
instances of a query is driven by an attempt to define a logic in which (4.9) does not hold.

$P(a) \lor P(b) \models \exists x\,P(x)$  (4.9)

A logic does not sanction the chaining of instances of a consequent if it has the Strong
Herbrand Property:

For any $kb \Rightarrow Q$, a generalized Skolem Normal Form sequent of FOPC, kb
entails Q iff $kb_{gr}$ entails a single element of $Q_{gr}$.

Whereas the Herbrand Property can reduce entailment with quantifiers to propositional
entailment for generalized sequents, the Strong Herbrand Property can reduce it to propositional
entailment for ordinary sequents. In a logic that has this property (4.9) holds only if

$P(a) \lor P(b) \models P(a)$  or  $P(a) \lor P(b) \models P(b)$  (4.10)

(4.10) does not hold in T or any logic weaker than it.

The  Strong  Herbrand  Property  can  be  divided  into  a  quantificationa!  component  and  a 
propositional  component.  The  quantificational  component  is  the  Herbrand  Property.  It 
reduces  the  problem  of  deciding  whether  (4.9)  holds  to  the  question  of  whether  (4.11)  does. 

$P(a) \lor P(b) \models \{P(a), P(b)\}$  (4.11)

We  can  endow  RQ  with  the  Herbrand  Property  by  giving  quantifiers  their  standard  interpreta¬ 
tion.  The  propositional  component  of  the  Strong  Herbrand  Property  is  the  Minuteness  Pro¬ 
perty. 

A  set  of  sentences  entails  a  second  set  iff  it  entails  a  single  element  of  the 
second  set. 

This  property  reduces  the  problem  of  deciding  whether  (4.11)  holds  to  the  problem  of  deciding 
whether  (4.10)  does. 

Now  consider  how  RQ  can  be  defined  so  as  to  possess  the  Minuteness  Property.  If  (4.11) 
does not hold then there must be a model that satisfies $P(a) \lor P(b)$ but falsifies both $P(a)$ and
$P(b)$. That a disjunction can be satisfied when its disjuncts aren't is reminiscent of modal logic
where $P(a) \lor P(b)$ may be necessarily true in spite of the non-necessity of $P(a)$ and of $P(b)$.
(Symbolically, $\Box(P(a) \lor P(b)) \not\models \Box P(a), \Box P(b)$.) This insight motivates the possible-worlds
style  definition  of  RQ,  to  which  we  now  turn. 

Calling  an  RP-model  a  3-setup,5  an  RQ-model  is  defined  simply  as  a  compatible  set  of  3- 
setups.  3-setups  assign  truth  values  to  quantifier-free  formulas  in  the  same  manner  as  RP 
models.  The  valuation  associated  with  an  RQ-model  is  defined  in  terms  of  the  valuations  asso¬ 
ciated  with  its  3-setups.  Speaking  very  loosely,  an  RQ-model  assigns  a  quantifier-free  formula 
True  if  the  formula  is  necessarily  True,  and  False  if  it  is  possibly  False.  Quantifiers  are  inter¬ 
preted  in  the  standard  way.  (4.12),  (4.13)  and  (4.14)  define  how  prenex  formulas  are  assigned 
truth  values  relative  to  a  value  assignment  e  and  an  RQ-model  M  whose  common  domain  is  D . 


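True $\in [\![\forall x\,\alpha]\!]^{M,e}$ iff for all $d \in D$, True $\in [\![\alpha]\!]^{M,e[d/x]}$  (4.12)

False $\in [\![\forall x\,\alpha]\!]^{M,e}$ iff for some $d \in D$, False $\in [\![\alpha]\!]^{M,e[d/x]}$

True $\in [\![\exists x\,\alpha]\!]^{M,e}$ iff for some $d \in D$, True $\in [\![\alpha]\!]^{M,e[d/x]}$  (4.13)

False $\in [\![\exists x\,\alpha]\!]^{M,e}$ iff for all $d \in D$, False $\in [\![\alpha]\!]^{M,e[d/x]}$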

5 This term derives from Belnap's use of the word "setup" to denote a 4-valued assignment.
and if $\alpha$ is quantifier-free

True $\in [\![\alpha]\!]^{M,e}$ iff for every 3-setup $s \in M$, True $\in [\![\alpha]\!]^{s,e}$  (4.14)

False $\in [\![\alpha]\!]^{M,e}$ iff for some 3-setup $s \in M$, False $\in [\![\alpha]\!]^{s,e}$

Regarding  these  semantic  equations,  several  points  are  noteworthy.  First  of  all  (4.12)  and 
(4.13) merely reiterate (4.1) and (4.2), and hence quantifiers receive their standard interpretation.
Secondly, (4.12)-(4.14) do not assign values to non-prenex form sentences. Finally, to
every  prenex  formula,  a  model  and  a  value  assignment  assign  a  legitimate  truth  value,  that  is,  a 
non-empty  subset  of  {True,  False}. 

The  valuation  associated  with  an  RQ  model  that  contains  only  one  3-setup  is  identical  to 
the  valuation  associated  with  that  3-setup.  Hence  every  RP-valuation  is  an  RQ-valuation  and 
therefore  RQ-entailment  is  weaker  than  RP-entailment. 

Furthermore,  RQ  can  generate  valuations  that  RP  does  not.  Among  them  are  those 
valuations  generated  by  countermodels  to  (4.9)  and  (4.11).  Consider  M,  a  model  containing  two 
3-setups, $s_1$ and $s_2$, which have the common Herbrand domain $\{a, b\}$. Supposing that these
3-setups assign truth values to the atomic sentences as shown in (4.15), then they must also make
the  assignments  shown  in  (4.16)  and  M  must  make  the  assignments  shown  in  (4.17). 


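$[\![P(a)]\!]^{s_1} = \{\text{True}\}$
$[\![P(a)]\!]^{s_2} = \{\text{False}\}$
$[\![P(b)]\!]^{s_1} = \{\text{False}\}$
$[\![P(b)]\!]^{s_2} = \{\text{True}\}$  (4.15)

$[\![P(a) \lor P(b)]\!]^{s_1} = \{\text{True}\}$
$[\![P(a) \lor P(b)]\!]^{s_2} = \{\text{True}\}$  (4.16)

$[\![P(a)]\!]^{M} = \{\text{False}\}$
$[\![P(b)]\!]^{M} = \{\text{False}\}$
$[\![P(a) \lor P(b)]\!]^{M} = \{\text{True}\}$
$[\![\exists x\, P(x)]\!]^{M} = \{\text{False}\}$  (4.17)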

By satisfying $P(a) \lor P(b)$ and falsifying both $\exists x\, P(x)$ and $\{P(a), P(b)\}$, M demonstrates that, as
desired, neither (4.9) nor (4.11) holds in RQ.
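The countermodel can be checked mechanically. The following Python sketch is illustrative only: it represents a truth value as a non-empty subset of {True, False}, and it assumes the usual Belnap-style clause for disjunction in a 3-setup (the RP semantics is defined in Chapter 2, not here). The names `combine`, `v_or`, `rq_value` and `rq_exists` are hypothetical helpers, not part of the thesis.

```python
TRUE_ONLY = frozenset({True})
FALSE_ONLY = frozenset({False})

def combine(has_true, has_false):
    """Build a truth value (a subset of {True, False}) from two membership flags."""
    return frozenset(v for v, present in ((True, has_true), (False, has_false)) if present)

def v_or(a, b):
    # Assumed RP clause: True if either disjunct is True, False if both are False.
    return combine(True in a or True in b, False in a and False in b)

def rq_value(setups, value_in_setup):
    # (4.14): True iff every 3-setup assigns True; False iff some 3-setup assigns False.
    values = [value_in_setup(s) for s in setups]
    return combine(all(True in v for v in values), any(False in v for v in values))

def rq_exists(setups, domain, atom):
    # (4.13) applied to a quantifier-free body atom(s, d).
    values = [rq_value(setups, lambda s, d=d: atom(s, d)) for d in domain]
    return combine(any(True in v for v in values), all(False in v for v in values))

# (4.15): the two 3-setups s1 and s2 over the Herbrand domain {a, b}.
s1 = {"a": TRUE_ONLY, "b": FALSE_ONLY}
s2 = {"a": FALSE_ONLY, "b": TRUE_ONLY}
M = [s1, s2]

def P(setup, d):
    """Value of the atom P(d) in one 3-setup."""
    return setup[d]

value_P_a = rq_value(M, lambda s: P(s, "a"))
value_P_b = rq_value(M, lambda s: P(s, "b"))
value_disjunction = rq_value(M, lambda s: v_or(P(s, "a"), P(s, "b")))
value_exists = rq_exists(M, ["a", "b"], P)
print(value_P_a, value_P_b, value_disjunction, value_exists)
```

Under these assumptions the computed values agree with (4.17): the disjunction is True in M although neither disjunct, nor the existential sentence, is.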

4.3.  Properties  of  RQ 

In  a  straightforward  manner,  this  section  proves  that  RQ  does  indeed  have  the  properties 
that led to its design: that $\models_{RP}$ and $\models_{RQ}$ completely agree on the ordinary sequents of QFPC,
and  that  in  virtue  of  having  the  Minuteness  Property  and  the  Herbrand  Property,  RQ  has  the 
Strong  Herbrand  Property. 

First consider the Herbrand Theorem for RQ. Since RQ-models are composed of RP
models, the Herbrand Model Lemma holds in RQ as well as in RP. Because of this and RQ's
use of the standard interpretation for quantifiers, the Herbrand Theorem also holds in RQ.
Now consider the RP-RQ Equivalence Theorem and the Minuteness Theorem.

RP-RQ  Equivalence  Theorem 

An ordinary sequent of QFPC is in $\models_{RP}$ iff it is in $\models_{RQ}$.

Proof 

if  clause:  Trivial  since  every  RP-model  is  an  RQ-model. 

only-if clause: Let $kb \Rightarrow q$ be an ordinary sequent of QFPC. Assuming that RQ-model $M_{RQ}$
satisfies kb and falsifies q, I show that some 3-setup in $M_{RQ}$ satisfies kb and falsifies q. Since
$M_{RQ}$ falsifies q, some 3-setup $s \in M_{RQ}$ falsifies q. Since $M_{RQ}$ satisfies kb, so does every 3-setup
in $M_{RQ}$. Therefore some 3-setup in $M_{RQ}$ satisfies kb and falsifies q. ■

Minuteness Theorem (a.k.a. the Minuteness Theorem)

For any $kb \Rightarrow Q$, a generalized sequent of QFPC, $kb \models_{RQ} Q$ iff for some $q \in Q$, $kb \models_{RQ} q$.

Proof 

if  clause:  Obvious. 

only-if clause: I show that if $kb \not\models_{RQ} q$ for every $q \in Q$ then $kb \not\models_{RQ} Q$. I do this by assuming
that for every $q \in Q$ there is a Herbrand model $M_q$ that satisfies kb and falsifies q, and constructing
a Herbrand model M that satisfies kb and falsifies Q. The Herbrand Lemma justifies
my restricted attention to Herbrand models. Let M be the union of every $M_q$. Since they are
Herbrand 3-setups, the 3-setups in M are compatible and therefore M is an RQ-model.
Since each 3-setup $s \in M$ is in some $M_q$ and $M_q$ satisfies kb, s also satisfies kb. Hence M
satisfies kb. Furthermore, every $q \in Q$ is falsified by some 3-setup in $M_q$, and hence by some
3-setup in M. Therefore M falsifies Q. ■

The  Herbrand  Theorem,  the  Minuteness  Theorem  and  the  RP-RQ  Equivalence  Theorem 
state  the  most  fundamental  properties  of  RQ.  These  results  dovetail  together  to  form  compo¬ 
site  results  in  a  manner  suggestive  of  the  way  that  sentences  combine  to  form  paragraphs.  The 
Herbrand Theorem relates $\models_{RQ}$ for FOPC sequents to $\models_{RQ}$ for generalized QFPC sequents,
which the Minuteness Theorem relates to $\models_{RQ}$ for ordinary sequents, which the RP-RQ
Equivalence Theorem relates to $\models_{RP}$ for ordinary sequents. RP-entailment for ordinary QFPC
sequents  is  well-studied  in  Chapter  2.  Results  from  that  chapter,  such  as  the  No-Chaining 
Theorem,  can  be  dovetailed  onto  the  end  of  the  above  sequence  of  results. 

Of  these  composite  results,  I  now  present  two  of  the  most  interesting:  the  Strong  Her¬ 
brand  Theorem  and  the  Generalized  No  Chaining  Theorem. 
Strong Herbrand Theorem

Let $kb \Rightarrow Q$ be a generalized SNF sequent of FOPC. Then $kb \models_{RQ} Q$ iff $kb_{gr}$ RQ-entails some
element of $Q_{gr}$.

Proof 

Follows  immediately  from  the  Herbrand  Theorem  for  RQ  and  the  Minuteness  Theorem.  ■ 


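Generalized No-Chaining Theorem

Let $\Phi$ be a set of QFPC sentences and A be a set of facts. Then $\Phi \models_{RQ} A$ iff for some $\phi \in \Phi$ and
$\alpha \in A$, $\phi \models_{RQ} \alpha$.

Proof

$\Phi \models_{RQ} A$ iff for some $\alpha \in A$, $\Phi \models_{RQ} \alpha$  (Minuteness Theorem)
iff for some $\alpha \in A$, $\Phi \models_{RP} \alpha$  (RP-RQ Equivalence Theorem)
iff for some $\alpha \in A$ and $\phi \in \Phi$, $\phi \models_{RP} \alpha$  (No-Chaining Theorem for RP)
iff for some $\alpha \in A$ and $\phi \in \Phi$, $\phi \models_{RQ} \alpha$  (RP-RQ Equivalence Theorem) ■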


4.4.  Computing  Retrievability 

This  section  examines  how  the  retrievability  relation  specified  by  RQ  can  be  decided.  As 
in  previous  chapters  the  use  of  a  normal  form  divides  the  presentation  in  two.  The  first  part 
presents  the  Skolem  Normal  Transformation,  which  transforms  sequents  into  Skolem  Normal 
Form.  The  second  part  demonstrates  that  by  treating  quantified  variables  schematically,  the 
Schematic Sequent Retrieval Algorithm can be used to decide whether an SNF sequent is in
$\models_{RQ}$.

The  Skolem  Normal  Transformation  (SNT)  defined  below  maps  an  arbitrary  sequent  into 
an  SNF  sequent  in  such  a  way  that,  as  stated  by  the  Skolem  Normal  Transformation  Theorem, 
its input is in $\models_{RQ}$ iff its output is. The SNT Theorem for T is well-published. Since its proof
applies  to  RQ  as  well  as  to  T,  the  theorem  is  stated  without  proof. 


Definition:  Skolem  Normal  Transformation 

Let  kb=$q  be  an  FOPC  sequent.  Its  Skolem  Normal  Transform  is  computed  by  the  following 
steps. 

(1) While kb contains a sentence $\phi$ that contains an existential quantifier do:

Observe that $\phi$ is of the form

$\forall x_1 \forall x_2 \cdots \forall x_n \exists y\, \psi[y]$

for some $n \ge 0$ and prenex-form formula $\psi[y]$. Choose f, some n-ary function symbol
that does not occur in $kb \Rightarrow q$ and replace $\phi$'s occurrence in $kb \Rightarrow q$ with

$\forall x_1 \forall x_2 \cdots \forall x_n\, \psi[f(x_1, \ldots, x_n)]$

(2) While q contains a universal quantifier do:

Observe that q is of the form

$\exists x_1 \exists x_2 \cdots \exists x_n \forall y\, \psi[y]$

for some $n \ge 0$ and prenex-form formula $\psi[y]$. Choose f, some n-ary function symbol
that does not occur in $kb \Rightarrow q$ and replace q's occurrence in $kb \Rightarrow q$ with

$\exists x_1 \exists x_2 \cdots \exists x_n\, \psi[f(x_1, \ldots, x_n)]$
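The two loops above can be sketched in Python. This is a minimal illustration rather than the thesis's implementation. It assumes a prenex sentence is given as a quantifier prefix plus a matrix built from nested tuples, that variables and constants are strings, and that the generated names `sk0`, `sk1`, ... do not already occur in the sequent. The predicate names `Loves` and `P` are made up for the example.

```python
from itertools import count

def substitute(expr, var, term):
    """Replace every occurrence of variable `var` in a nested-tuple expression."""
    if isinstance(expr, str):
        return term if expr == var else expr
    head, *args = expr  # the head is a predicate, function, or connective symbol
    return (head, *(substitute(a, var, term) for a in args))

def skolemize(prefix, matrix, eliminate, fresh):
    """Remove every quantifier of kind `eliminate` from a prenex sentence.

    Use eliminate="exists" for KB sentences (step 1) and eliminate="forall"
    for the query (step 2). Each removed variable is replaced by a fresh
    function symbol applied to the variables quantified to its left.
    """
    kept = []
    for quantifier, var in prefix:
        if quantifier == eliminate:
            skolem_term = (next(fresh), *(v for _, v in kept))
            matrix = substitute(matrix, var, skolem_term)
        else:
            kept.append((quantifier, var))
    return kept, matrix

fresh = ("sk%d" % i for i in count())

# KB sentence: forall x exists y Loves(x, y)
kb_prefix, kb_matrix = skolemize(
    [("forall", "x"), ("exists", "y")], ("Loves", "x", "y"), "exists", fresh)

# Query: exists x forall y P(x, y)
q_prefix, q_matrix = skolemize(
    [("exists", "x"), ("forall", "y")], ("P", "x", "y"), "forall", fresh)

print(kb_prefix, kb_matrix)
print(q_prefix, q_matrix)
```

Processing the prefix left to right gives the same result as repeating the "while" step, because the variables to the left of each eliminated quantifier are exactly the surviving ones.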

Skolem  Normal  Transformation  Theorem 

A generalized sequent of FOPC is in $\models_{RQ}$ iff its Skolem Normal Transform is.

It  should  be  noted  that  unlike  the  CNT,  the  SNT  is  not  a  transformation  on  sentences;  it 
only  applies  to  sequents  as  a  whole.  The  SNT  does  not  replace  sentences  with  their  equivalents; 
it merely preserves $\models_{RQ}$, which is all we are currently interested in.

Once a sequent is in SNF the Schematic Sequent Retrieval Algorithm can decide if it is in
$\models_{RQ}$. Consider the SNF sequent of FOPC $kb \Rightarrow q$ and the schematic sequent $kb' \Rightarrow q'$ that is
obtained by dropping its quantifiers and renaming its logical variables to schematic variables.
From Chapter 3 we know that the SSRA can decide whether $kb'_{gr}$ RP-entails some instance of
q'. Therefore, because $kb_{gr} = kb'_{gr}$, $q_{gr} = q'_{gr}$, and RP and RQ are equivalent for ordinary
sequents of QFPC, the SSRA also decides whether $kb_{gr}$ RQ-entails some instance of q. Finally,
according to the Strong Herbrand Theorem, deciding whether this last implication holds is
equivalent to deciding whether $kb \models_{RQ} q$.

FOPC  Retrieval  Theorem 

Let $kb \Rightarrow q$ be an ordinary SNF sequent of FOPC and let $kb' \Rightarrow q'$ be the schematic sequent that
results from removing all quantifiers from $kb \Rightarrow q$. Then $kb \models_{RQ} q$ iff $kb' \vdash_{SSRA} q'$.


Proof

$kb \models_{RQ} q$ iff for some $qg \in q_{gr}$, $kb_{gr} \models_{RQ} qg$  (Strong Herbrand Theorem)

iff for some $qg' \in q'_{gr}$, $kb'_{gr} \models_{RQ} qg'$  (Since $kb_{gr} = kb'_{gr}$ and $q_{gr} = q'_{gr}$)

iff for some $qg' \in q'_{gr}$, $kb'_{gr} \models_{RP} qg'$  (RP-RQ Equivalence Theorem)

iff q' is retrievable from kb'  (Def. of Schematic Retrievability)

iff $kb' \vdash_{SSRA} q'$  (SSRA Correctness Theorem) ■


Chapter  5 


A  Retriever  that  Reasons  about  Taxonomies 

This  chapter  develops  a  retriever  that  augments  the  no-chaining  retriever  of  the  previous 
chapter  with  the  capability  to  chain  in  certain  highly-constrained  circumstances,  namely  when 
reasoning  about  taxonomies  and  when  performing  inheritance.  This  kind  of  inference  has  been 
performed  by  most  semantic-network  systems. 

Let  us  consider  an  example  that  illustrates  what  I  mean  by  "reasoning  about  taxonomies" 
and  by  "inheritance."  Let  kb  be  the  knowledge  base  containing  sentences  (5.1)-(5.4). 


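Mustang(Olde-Black)  (5.1)

$\forall x\; \mathit{Mustang}(x) \rightarrow \mathit{Auto}(x)$  (5.2)

$\forall x\; \mathit{Auto}(x) \rightarrow \mathit{Vehicle}(x)$  (5.3)

$\forall x\; \mathit{Mustang}(x) \rightarrow \mathit{Built}(\mathit{Ford}, x)$  (5.4)

Among the sentences T-entailed by kb are

$\forall x\; \mathit{Mustang}(x) \rightarrow \mathit{Vehicle}(x)$  (5.5)

Auto(Olde-Black)  (5.6)

Built(Ford,Olde-Black)  (5.7)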


The derivations of (5.5) and (5.6) from kb each exemplify what I have in mind when I speak of
reasoning  about  a  taxonomy  while  the  derivation  of  (5.7)  exemplifies  inheritance. 

The  retriever  of  Chapter  4  performs  none  of  these  inferences;  kb  RQ-entails  neither  (5.5), 
(5.6),  nor  (5.7).  If  this  is  not  clear  then  observe  that  any  model  M  that  has  a  domain  of  three 
elements (denoted by "Olde-Black", "Ford" and "Touring-Machine") and that makes the following
truth assignments satisfies each of (5.1)-(5.4) but falsifies each of (5.5)-(5.7).

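$[\![\mathit{Mustang}(\mathit{Olde\text{-}Black})]\!]^{M} = \{\text{True}, \text{False}\}$

$[\![\mathit{Mustang}(\mathit{Touring\text{-}Machine})]\!]^{M} = \{\text{True}\}$

$[\![\mathit{Mustang}(\mathit{Ford})]\!]^{M} = \{\text{False}\}$

$[\![\mathit{Auto}(\mathit{Olde\text{-}Black})]\!]^{M} = \{\text{False}\}$

$[\![\mathit{Auto}(\mathit{Touring\text{-}Machine})]\!]^{M} = \{\text{True}, \text{False}\}$

$[\![\mathit{Auto}(\mathit{Ford})]\!]^{M} = \{\text{False}\}$

$[\![\mathit{Vehicle}(\mathit{Touring\text{-}Machine})]\!]^{M} = \{\text{False}\}$

$[\![\mathit{Vehicle}(\mathit{Ford})]\!]^{M} = \{\text{False}\}$

$[\![\mathit{Built}(\mathit{Ford}, \mathit{Olde\text{-}Black})]\!]^{M} = \{\text{False}\}$

$[\![\mathit{Built}(\mathit{Ford}, \mathit{Touring\text{-}Machine})]\!]^{M} = \{\text{False}\}$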


The  goal  of  this  chapter  is  to  extend  RQ  so  that  it  can  reason  about  taxonomies  and  per¬ 
form  inheritance  without  performing  any  other  kind  of  chaining.  A  difficulty  arises  because  cer¬ 
tain  desirable  inferences — such  as  those  exemplified  above — are  instances  of  the  application  of 
modus  ponens,  an  inference  rule  that  the  retriever  should  not  use  indiscriminately.  How  then 
can  a  specification  of  retrieval  distinguish  the  desirable  forms  of  chaining  from  the  undesirable 
forms? 

The  way  out  of  this  predicament  is  to  follow  a  strategy  often  employed  in  the  construction 
of  semantic-network  systems.  In  such  systems  specialized  notation  is  used  to  encode  certain 
kinds  of  information  and  specialized  inference  mechanisms  are  then  used  to  deal  with  that  nota¬ 
tion.  Typically,  special  nodes  and  links  are  used  to  encode  taxonomic  information.  Accord¬ 
ingly,  the  retriever  specified  in  this  chapter  operates  on  a  language,  called  the  Sorted  First- 
Order  Predicate  Calculus  (SFOPC),  that  extends  FOPC  with  special  notation  for  representing 
categories  and  for  expressing  that  members  of  a  category  have  particular  properties. 

5.1.  The  Sorted  First-Order  Predicate  Calculus 

We  now  turn  to  the  definition  of  the  Sorted  First-Order  Predicate  Calculus.  Since 
SFOPC  has  some  of  the  features  of  a  traditional  sorted  logic,  I  henceforth  use  the  terminology 
of  that  field  rather  than  more  general  terms  such  as  "taxonomy.”  Roughly  speaking  a  sort  is  a 
taxonomic  category  and  a  sort  symbol  denotes  a  sort. 

In  addition  to  the  usual  function  and  predicate  symbols,  the  SFOPC  lexicon  contains  a 
countable  set  of  sort  symbols.  Typographically,  sort  symbols  are  written  entirely  in  upper¬ 
case.  Semantically,  a  sort  symbol,  like  a  monadic  predicate,  denotes  a  subset  of  the  domain. 

In addition to the ordinary kind of variables, SFOPC has restricted variables. A restricted
variable is a pair, $x{:}\tau$, where x is a variable name and $\tau$, often referred to as a restriction, is a
finite set of sort symbols. Henceforth, the term "variable" refers generally to either an ordinary
variable or a restricted variable.

A variable whose restriction is $\{S_1, S_2, \ldots, S_n\}$ is written as $x{:}S_1, S_2, \ldots, S_n$. The order
of the sort symbols is irrelevant, and thus $x{:}S_1, S_2$ and $x{:}S_2, S_1$ are one and the same variable.
For clarity, variables are often written in angle brackets, such as $\langle x{:}S_1, S_2, S_3 \rangle$.
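As an illustration of why the order of sort symbols is irrelevant, a restricted variable can be modelled as a name paired with a set. The sketch below is one possible encoding, not the thesis's data structure. The class name `Var`, the helper `var`, and the choice to treat an empty restriction as an ordinary variable are all assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    """A variable name paired with a restriction (a finite set of sort symbols)."""
    name: str
    restriction: frozenset = frozenset()  # assumption: empty set = ordinary variable

    def __str__(self):
        if not self.restriction:
            return self.name
        # Sort symbols are printed in a fixed order only for readability.
        return "<" + self.name + ":" + ",".join(sorted(self.restriction)) + ">"

def var(name, *sorts):
    """Convenience constructor: var("x", "S1", "S2") builds <x:S1,S2>."""
    return Var(name, frozenset(sorts))

v1 = var("x", "S1", "S2")
v2 = var("x", "S2", "S1")
print(v1, v2, v1 == v2)
```

Because the restriction is a set, `v1` and `v2` compare equal and hash identically, so they can be used interchangeably as dictionary keys in a substitution.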

To  avoid  confusion  I  never  write  a  formula  containing  two  distinct  variables  that  have  the 
same variable name. That is, no formula contains variables $\langle x{:}\tau \rangle$ and $\langle x{:}\tau' \rangle$ where $\tau$ and $\tau'$ are
distinct.  This  enables  use  of  the  following  shorthand.  If  a  formula  has  multiple  occurrences  of 
the  same  variable  then  the  restrictions  often  are  written  on  only  the  first  occurrence.  For 
example,  (5.8)  can  be  abbreviated  as  (5.9). 


$\forall x{:}S\;\; P(x{:}S) \lor Q(x{:}S)$  (5.8)

$\forall x{:}S\;\; P(x) \lor Q(x)$  (5.9)


$\tau$ and $\omega$ are meta-linguistic symbols that always stand for restrictions. $\langle x{:}\tau_1, \tau_2, \ldots, \tau_n \rangle$
is a variable whose restriction contains precisely the sorts in $\tau_1 \cup \cdots \cup \tau_n$.

Put crudely, a Tarskian model for SFOPC is a Tarskian FOPC model to which an assignment
to sort symbols has been grafted.1 More precisely, a T-model for SFOPC is a pair $\langle M, A_s \rangle$
where M is a Tarskian FOPC model and $A_s$ is a sort assignment, a function that maps each
sort symbol to a subset of the domain of M.

A  sort  symbol  denotes  the  set  of  individuals  that  As  assigns  to  it.  A  restriction  also 
denotes  a  sort,  the  intersection  of  the  sorts  denoted  by  the  sort  symbols  in  the  restriction. 
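Therefore, if $M = \langle M', A_s \rangle$ is a T-model for SFOPC, and S is a sort symbol, and
$\tau = \{S_1, \ldots, S_n\}$ is a restriction, then:

$[\![S]\!]^{M,e} = A_s(S)$

$[\![\tau]\!]^{M,e} = [\![S_1]\!]^{M,e} \cap \cdots \cap [\![S_n]\!]^{M,e}$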

A  restricted  variable  only  ranges  over  the  subset  of  the  domain  denoted  by  its  restriction. 
Formally  this  is  captured  by  the  following  semantic  rules  for  quantifiers  with  restricted  vari¬ 
ables: 

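True $\in [\![\forall x{:}\tau\; \phi]\!]^{M,e}$ iff for every $d \in [\![\tau]\!]^{M,e}$, True $\in [\![\phi]\!]^{M,e[d/x]}$  (5.10)

False $\in [\![\forall x{:}\tau\; \phi]\!]^{M,e}$ iff for some $d \in [\![\tau]\!]^{M,e}$, False $\in [\![\phi]\!]^{M,e[d/x]}$

True $\in [\![\exists x{:}\tau\; \phi]\!]^{M,e}$ iff for some $d \in [\![\tau]\!]^{M,e}$, True $\in [\![\phi]\!]^{M,e[d/x]}$  (5.11)

False $\in [\![\exists x{:}\tau\; \phi]\!]^{M,e}$ iff for every $d \in [\![\tau]\!]^{M,e}$, False $\in [\![\phi]\!]^{M,e[d/x]}$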

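Notice that if S is a sort symbol that denotes the entire domain in some model M, then
$[\![\forall \langle x{:}S \rangle \phi]\!]^{M,e} = [\![\forall x\, \phi]\!]^{M,e}$ and $[\![\exists \langle x{:}S \rangle \phi]\!]^{M,e} = [\![\exists x\, \phi]\!]^{M,e}$. Consequently, unsorted variables
are often treated as sorted variables implicitly restricted to the "universal" sort. Also notice
that if S denotes the empty set in some model, then that model assigns {True} to $\forall \langle x{:}S \rangle \phi$ and
{False} to $\exists \langle x{:}S \rangle \phi$.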
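The behaviour of restricted quantifiers in a two-valued Tarskian model, including the universal-sort and empty-sort cases just noted, can be sketched as follows. This is an illustrative encoding only: the domain, the sort symbols `ALL`, `EVEN`, `SMALL` and `EMPTY`, and the function names are assumptions, and the body of the quantifier is passed in as an ordinary Python predicate.

```python
def denote_restriction(restriction, sort_assignment, domain):
    """Intersection of the sets that the sort assignment gives to each sort symbol."""
    result = set(domain)
    for sort_symbol in restriction:
        result &= sort_assignment[sort_symbol]
    return result

def forall_restricted(restriction, body, sort_assignment, domain):
    # The body need only hold for the members of the restriction.
    return all(body(d) for d in denote_restriction(restriction, sort_assignment, domain))

def exists_restricted(restriction, body, sort_assignment, domain):
    # A witness must be found among the members of the restriction.
    return any(body(d) for d in denote_restriction(restriction, sort_assignment, domain))

domain = {1, 2, 3, 4}
sort_assignment = {
    "ALL": {1, 2, 3, 4},   # a sort denoting the whole domain
    "EVEN": {2, 4},
    "SMALL": {1, 2},
    "EMPTY": set(),
}

def is_even(d):
    return d % 2 == 0

all_even_are_even = forall_restricted({"EVEN"}, is_even, sort_assignment, domain)
all_small_are_even = forall_restricted({"SMALL"}, is_even, sort_assignment, domain)
some_small_even = exists_restricted({"EVEN", "SMALL"}, is_even, sort_assignment, domain)
vacuous_forall = forall_restricted({"EMPTY"}, is_even, sort_assignment, domain)
vacuous_exists = exists_restricted({"EMPTY"}, is_even, sort_assignment, domain)
print(all_even_are_even, all_small_are_even, some_small_even, vacuous_forall, vacuous_exists)
```

Quantifying over `ALL` gives the same answer as quantifying over the raw domain, which is the sense in which an unsorted variable behaves like one restricted to the universal sort.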

Now  that  quantifiers  can  be  restricted  to  range  over  subsets  of  the  domain  we  need  a  way 
to  express  relationships  among  these  subsets.  To  do  this,  SFOPC  is  endowed  with  a  special  set 
of  formulas,  which  are  called  S-formulas  to  distinguish  them  from  the  previous  formulas,  which 
are called A-formulas. S-formulas are constructed like ordinary formulas of FOPC except that
they contain no ordinary predicate symbols; in their place are sort symbols acting as monadic
predicate symbols. Hence, every atomic S-formula is of the form S(t), where S is a sort symbol
and t is an ordinary term. In the obvious way I use the terms S-sentence and S-literal.


S-formulas  are  assigned  truth  values  as  one  would  expect:  an  atomic  formula  S(t)  is 
assigned True if the domain element denoted by t is a member of the set denoted by S, and a
molecular  S-formula  is  assigned  a  value  in  the  usual  Tarskian  manner. 

SFOPC  is  no  more  expressive  than  FOPC;  each  sentence  of  SFOPC  is  T-equivalent  to  one 
(of  about  the  same  length)  of  FOPC.  Clearly  the  addition  of  sort  symbols  does  not  make  the 
language  more  expressive  since  they  behave  semantically  like  monadic  predicate  symbols.  Nor 
does  the  addition  of  restricted  variables  enhance  the  expressiveness  of  the  language.  To  see 
this, observe that if $\tau$ is the restriction $\{S_1, \ldots, S_n\}$ and e is a variable assignment that
maps x to d, then

$d \in [\![\tau]\!]^{M,e}$ iff $[\![S_1(x) \land S_2(x) \land \cdots \land S_n(x)]\!]^{M,e} = \text{True}$

Because of this relationship I henceforth abbreviate the formula $S_1(t) \land S_2(t) \land \cdots \land S_n(t)$ as
$\tau(t)$. Finally, observe that any formula containing restricted quantifiers can be rewritten to a
T-equivalent one without restricted quantifiers on the basis of these equivalences:

$\forall x{:}\tau\;\; \psi[x{:}\tau] \equiv_T \forall x\;\; \tau(x) \rightarrow \psi[x]$

$\exists x{:}\tau\;\; \psi[x{:}\tau] \equiv_T \exists x\;\; \tau(x) \land \psi[x]$

The formula that results from removing all restricted quantifiers from a formula $\phi$ by this
rewriting process is called the normalization of $\phi$ and is denoted by $\phi^N$. If $\Phi$ is a set of formulas,
then $\Phi^N = \{\phi^N \mid \phi \in \Phi\}$.
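The normalization can be sketched as a small recursive rewrite. The tuple encoding below is an assumption made for illustration: quantified formulas are `(quantifier, variable, restriction, body)` with the restriction a frozenset of sort symbols, connectives are `("and" | "or" | "implies", left, right)` and `("not", sub)`, and anything else is treated as an atom. It also assumes no predicate shares a name with a connective.

```python
def tau(restriction, term):
    """The conjunction S1(t) and ... and Sn(t) for a non-empty restriction."""
    atoms = [(sort_symbol, term) for sort_symbol in sorted(restriction)]
    formula = atoms[0]
    for atom in atoms[1:]:
        formula = ("and", formula, atom)
    return formula

def normalize(phi):
    """Rewrite restricted quantifiers into unrestricted ones with a sort guard."""
    op = phi[0]
    if op in ("forall", "exists"):
        _, x, restriction, body = phi
        body = normalize(body)
        if restriction:
            guard = tau(restriction, x)
            # forall x:tau psi  becomes  forall x (tau(x) -> psi)
            # exists x:tau psi  becomes  exists x (tau(x) and psi)
            body = ("implies", guard, body) if op == "forall" else ("and", guard, body)
        return (op, x, frozenset(), body)
    if op in ("and", "or", "implies"):
        return (op, normalize(phi[1]), normalize(phi[2]))
    if op == "not":
        return ("not", normalize(phi[1]))
    return phi  # atomic formula

# forall x:MUSTANG Built(Ford, x)
restricted = ("forall", "x", frozenset({"MUSTANG"}), ("Built", "Ford", "x"))
normalized = normalize(restricted)
print(normalized)
```

Applied to a sentence of the form $\forall x{:}\mathit{MUSTANG}\; \mathit{Built}(\mathit{Ford}, x)$, the sketch yields the familiar implication with the sort symbol acting as a monadic predicate.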

A  KB  now  consists  of  a  set  of  SFOPC  sentences  and  a  query  always  specifies  an  SFOPC 
sentence  to  be  retrieved.  It  is  often  convenient  to  consider  a  KB  as  consisting  of  two  com¬ 
ponents,  an  AKB  containing  all  the  A-sentences  in  the  KB,  and  an  SKB  containing  all  the  S- 
sentences  in  the  KB.  Sequents  of  SFOPC  are  often  written  in  a  form  that  exhibits  the  distinc¬ 
tion  between  S-sentences  and  A-sentences.  Namely,  in  an  SFOPC  sequent  written  in  the  form 
$\Sigma, akb \Rightarrow q$, the antecedent is divided into a set of S-sentences, $\Sigma$, and a set of A-sentences, akb.
Similarly, entailments are written in the form $\Sigma, akb \models q$.

As  before,  only  A-sentences  in  prenex  form  are  considered  and  therefore  "A-sentence"  only 
refers  to  prenex  form  A-sentences.  As  this  chapter  progresses  various  restrictions  are  placed  on 
the  SKB. 

It  is  worth  noting  that  the  SKB  is  a  theory  of  a  taxonomy,  not  a  taxonomy. 

5.2.  RT:  The  Logic  of  a  Taxonomic  Retriever 

This section defines RT, a model theory for SFOPC whose entailment relation serves as a
retrievability relation. RT extends RQ to handle the syntactic extensions introduced in the last
section. Recall that the resulting retrievability relation is to respect the no-chaining restriction,
except that it is to reason completely with taxonomic information. Thus the retriever
should perform all taxonomic inferences sanctioned by the Tarskian semantics. We will see
shortly that this is a simple notion to capture in a model theory.

Since  the  retriever  is  to  reason  fully  about  sorts,  the  interpretation  of  sort  symbols — 
unlike  the  interpretation  of  predicate  symbols — should  not  be  weakened.  So  a  sort  symbol 
should  be  given  its  full  Tarskian  meaning.  The  definitions  that  follow  are  made  with  this  in 
mind. 

Just  as  a  Tarskian  SFOPC  model  is  formed  from  a  Tarskian  FOPC  model  by  appending  a 
sort  assignment  to  it,  so  an  RT-model  is  formed  from  an  RQ  model.  Thus,  an  RT-model  is 
merely  a  pair  consisting  of  an  RQ-model  and  a  sort  assignment.  An  alternate  view  is  that  RT 
relaxes  the  Tarskian  model  theory  for  SFOPC  in  the  same  way  that  RQ  relaxes  the  Tarskian 
model  theory  for  FOPC. 

An RT-model, $M = \langle M', A_s \rangle$, assigns the same semantic values to sort symbols and S-formulas
as does any T-model whose domain is the common domain of M' and whose sort
assignment is $A_s$. To prenex-form A-formulas containing no restricted quantifiers, M
assigns truth values in the same manner as M'. Hence, equations (4.12)-(4.14) describe how RT
assigns truth values to such A-formulas. Restricted quantifiers are treated in RT as in T, that
is, according to (5.10) and (5.11). Thus, ignoring unrestricted quantifiers, the semantic equations
for RT are:

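True $\in [\![\forall x{:}\tau\; \phi]\!]^{M,e}$ iff for every $d \in [\![\tau]\!]^{M,e}$, True $\in [\![\phi]\!]^{M,e[d/x]}$  (5.12)

False $\in [\![\forall x{:}\tau\; \phi]\!]^{M,e}$ iff for some $d \in [\![\tau]\!]^{M,e}$, False $\in [\![\phi]\!]^{M,e[d/x]}$

True $\in [\![\exists x{:}\tau\; \phi]\!]^{M,e}$ iff for some $d \in [\![\tau]\!]^{M,e}$, True $\in [\![\phi]\!]^{M,e[d/x]}$  (5.13)

False $\in [\![\exists x{:}\tau\; \phi]\!]^{M,e}$ iff for every $d \in [\![\tau]\!]^{M,e}$, False $\in [\![\phi]\!]^{M,e[d/x]}$

and if $\alpha$ is quantifier-free

True $\in [\![\alpha]\!]^{M,e}$ iff for every 3-setup $s \in M'$, True $\in [\![\alpha]\!]^{s,e}$  (5.14)

False $\in [\![\alpha]\!]^{M,e}$ iff for some 3-setup $s \in M'$, False $\in [\![\alpha]\!]^{s,e}$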

Notice  that,  as  in  RQ,  these  equations  do  not  assign  values  to  non-prenex  form  A-sentences. 

In contrast to T, RT treats sort symbols and monadic predicate symbols differently.
Whereas each monadic predicate is mapped to a function from the domain to the set of three
truth values, each sort symbol is mapped to a subset of the domain, or, equivalently, to a function
from the domain to {True, False}. Thus, S-formulas operate in a two-valued logic,
A-formulas without restricted quantifiers operate in a three-valued logic, and A-formulas with
restricted quantifiers operate in both. The logic sanctions chaining in those parts of the
language that operate in two values but not in those parts that operate in three values.

In order to observe some chaining that RT sanctions, let us return to the simple taxonomic
inferences considered at the beginning of this chapter. First observe that RT agrees with
RQ that none of (5.5)-(5.7) is entailed by kb, the KB containing sentences (5.1)-(5.4). However,
if all of the sentences are written using MUSTANG, AUTO, and VEHICLE as sort symbols,
then the entailments do obtain. (5.1)-(5.4) can be written (T-equivalently) in SFOPC as three
S-sentences and one A-sentence:

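MUSTANG(Olde-Black)  (5.1')

$\forall x\; \mathit{MUSTANG}(x) \rightarrow \mathit{AUTO}(x)$  (5.2')

$\forall x\; \mathit{AUTO}(x) \rightarrow \mathit{VEHICLE}(x)$  (5.3')

$\forall x{:}\mathit{MUSTANG}\;\; \mathit{Built}(\mathit{Ford}, x)$  (5.4')

Furthermore (5.5) and (5.6) can be rewritten (T-equivalently) in SFOPC as two S-sentences:

$\forall x\; \mathit{MUSTANG}(x) \rightarrow \mathit{VEHICLE}(x)$  (5.5')

AUTO(Olde-Black)  (5.6')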


And now, (5.1')-(5.4') RT-entail (5.5'), (5.6'), and (5.7). In particular, (5.2') and (5.3') together
RT-entail (5.5'), and (5.1') and (5.2') together RT-entail (5.6'); both of these entailments
operate entirely within two-valued Tarskian logic and obtain for the usual reasons. Additionally,
(5.1') and (5.4') together RT-entail (5.7), but not as obviously. Consider any variable
assignment e, and any model M that satisfies both (5.1') and (5.4'). Let OB be
$[\![\mathit{Olde\text{-}Black}]\!]^{M,e}$. Since M satisfies (5.1'), $OB \in [\![\mathit{MUSTANG}]\!]^{M,e}$. M also satisfies (5.4'), and thus by
equation (5.12), True $\in [\![\mathit{Built}(\mathit{Ford}, x)]\!]^{M,e[OB/x]}$.1 Consequently, True $\in [\![\mathit{Built}(\mathit{Ford}, \mathit{Olde\text{-}Black})]\!]^{M,e}$.
That is, M satisfies "Built(Ford,Olde-Black)."
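The three inferences can be reproduced by a small Python sketch. It is not the retrieval algorithm developed later in the chapter. It assumes, for illustration only, that the SKB contains nothing but membership facts of the form S(c) and subsort axioms of the form $\forall x\; S(x) \rightarrow S'(x)$, so that two-valued sort reasoning reduces to a reachability computation. All function names are hypothetical.

```python
def supersorts(sort, subsort_axioms):
    """All sorts reachable from `sort` through the subsort axioms, itself included."""
    seen, frontier = {sort}, [sort]
    while frontier:
        current = frontier.pop()
        for sub, sup in subsort_axioms:
            if sub == current and sup not in seen:
                seen.add(sup)
                frontier.append(sup)
    return seen

def sorts_of(constant, memberships, subsort_axioms):
    """Every sort the constant provably belongs to."""
    result = set()
    for c, sort in memberships:
        if c == constant:
            result |= supersorts(sort, subsort_axioms)
    return result

def inherit(restricted_fact, constant, memberships, subsort_axioms):
    """Instantiate a restricted universal fact at a constant of the right sort."""
    variable, restriction, (predicate, *args) = restricted_fact
    if restriction <= sorts_of(constant, memberships, subsort_axioms):
        return (predicate, *(constant if a == variable else a for a in args))
    return None  # the constant is not known to satisfy the restriction

memberships = [("Olde-Black", "MUSTANG")]                     # (5.1')
subsort_axioms = [("MUSTANG", "AUTO"), ("AUTO", "VEHICLE")]   # (5.2'), (5.3')
fact_5_4 = ("x", frozenset({"MUSTANG"}), ("Built", "Ford", "x"))  # (5.4')

mustangs_are_vehicles = "VEHICLE" in supersorts("MUSTANG", subsort_axioms)          # (5.5')
olde_black_is_auto = "AUTO" in sorts_of("Olde-Black", memberships, subsort_axioms)  # (5.6')
built = inherit(fact_5_4, "Olde-Black", memberships, subsort_axioms)                # (5.7)
print(mustangs_are_vehicles, olde_black_is_auto, built)
```

The chaining happens only inside `supersorts` and `sorts_of`, that is, inside the two-valued sort theory. The A-sentence (5.4') is used once, as a single instance, which mirrors the division of labour described above.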

5.3.  An  Approach  to  Computing  with  Restricted  Quantifiers 

The  primary  goal  of  the  remainder  of  this  chapter  is  to  develop  an  algorithm  that  imple¬ 
ments  the  retriever  specified  in  the  previous  section — that  is,  a  procedure  for  deciding  whether 
any  given  ordinary  finite  SFOPC  sequent  is  in  the  RT-entailment  relation.  However,  before 
proceeding  with  the  details  of  the  technical  developments  it  is  worth  pausing  to  overview  the 
structure  of  these  developments.  In  order  to  present  a  clear  picture,  this  overview  omits  certain 
secondary,  though  important,  points. 

The  approach  used  to  develop  a  retrieval  algorithm  that  operates  on  a  language  with  res¬ 
tricted  quantifiers  and  a  sort  theory  mimics  the  approach  used  in  Chapters  3  and  4  to  develop  a 
retrieval  algorithm  that  operates  on  a  language  with  ordinary  quantifiers.  So,  let  us  begin  by 
recapping  the  approach  to  ordinary  quantifiers. 


1 It may or may not be the case that False $\in [\![\mathit{Built}(\mathit{Ford}, \mathit{Olde\text{-}Black})]\!]^{M,e}$.


First,  the  notion  of  substitution  was  developed  and  then  used  to  form  the  link  between 
schematic  sentences  and  their  ground  instances.  Then  the  Ground  Sequent  Retrieval  Algorithm 
was  "lifted"  to  form  the  Schematic  Sequent  Retrieval  Algorithm;  this  lifting  operation  primarily 
involved  replacing  each  test  for  equality  between  expressions  with  a  test  for  unifiability,  which, 
if  successful,  yields  a  complete  set  of  unifiers.  The  Lifting  Theorem  proved  that  the  SSRA  does 
indeed  treat  schematic  sentences  as  sentence  schemas,  that  is,  as  representatives  for  their 
ground  instances.  Thus,  it  was  established  that  the  SSRA  could  be  used  to  handle  quantified 
sentences  by  removing  their  quantifiers  and  replacing  their  quantified  variables  with  schematic 
variables. 

The  remainder  of  this  chapter  proceeds  along  the  same  lines,  except  that  instead  of  work¬ 
ing  with  ordinary  variables  it  works  with  restricted  variables,  both  quantified  and  schematic. 
First,  Section  5.4  introduces  the  notion  of  a  substitution  being  well  sorted.  Informally,  a  substi¬ 
tution  is  well  sorted  relative  to  a  sort  theory  if  it  maps  each  variable  to  a  term  that  satisfies  the 
restriction  associated  with  the  variable.  Well-sortedness  must  be  considered  relative  to  a  sort 
theory  because  it  is  the  sort  theory  that  determines  which  terms  satisfy  which  restrictions. 

Building  upon  the  notion  of  well  sortedness,  Section  5.5  examines  the  properties  of  RT. 
The most important result of that section, the Sorted Herbrand Theorem, relates the retrievability
of $\Sigma, akb \Rightarrow q$, under certain circumstances, to the retrievability of $akb_{\Sigma gr} \Rightarrow q_{\Sigma gr}$, where
$akb_{\Sigma gr}$ and $q_{\Sigma gr}$ are the ground instances of akb and q that are obtained by substitutions that
are well sorted relative to $\Sigma$. Hence, sentences with restricted quantifiers can be treated as
sorted  schematic  sentences,  schematic  sentences  in  which  the  schematic  variables  have  restric¬ 
tions  associated  with  them. 

Section  5.6  then  takes  up  the  task  of  developing  a  retriever  for  sorted  schematic  sequents. 
The  algorithm  itself,  the  Sorted  Schematic  Sequent  Retrieval  Algorithm  (SSSRA),  is  identical  to 
the  SSRA  with  the  exception  that  it  uses  well  sorted  unifiers  wherever  the  SSRA  uses  arbitrary 
unifiers.  The  Sorted  Lifting  Theorem  states  that  the  SSSRA  deals  with  sorted  schematic  sen¬ 
tences  as  it  would  deal  with  the  set  of  all  well  sorted  instances  of  the  schema.  This  theorem  is 
proved  by  systematically  modifying  the  proof  of  the  Lifting  Theorem  of  Chapter  3.  With  this 
in  place,  it  is  easy  to  see  that  retrievability  of  sequents  with  restricted  quantifiers  can  be 
decided  by  removing  all  quantifiers,  replacing  restricted  quantified  variables  with  restricted 
schematic  variables,  and  handing  the  resulting  sorted  schematic  sentence  to  the  SSSRA. 
Justification  for  doing  this  is  provided  by  the  Sorted  Lifting  Theorem  and  the  Sorted  Herbrand 
Theorem. 

We  now  consider  well  sorted  substitutions  and  unifiers,  which  form  the  foundation  of  this 
approach  to  restricted  quantifiers. 


5.4.  Well  Sorted  Substitutions  and  Unifiers 

According to the previously-stated informal definition, a substitution is well sorted relative
to a sort theory if it maps each variable to a term that satisfies the restriction associated
with the variable. More precisely, a substitution $\theta$ is well sorted relative to a sort theory $\Sigma$ if,
and only if, for every variable $x{:}\tau$, $(x{:}\tau)\theta$ is a term $t$ such that $\Sigma \models \forall(\tau(t))$.

Two special cases of this definition are worth noting. If $\theta$ is well sorted relative to $\Sigma$ and
maps $x{:}\tau$ to a ground term $t$, then it must be that $\Sigma \models \tau(t)$. In other words, $\Sigma$ must entail that
$t$ is of sort $\tau$. If $\theta$ maps $x{:}\tau$ to a variable $y{:}\omega$ then it must be that $\Sigma \models \forall y{:}\omega\ \tau(y)$. That is, $\Sigma$
must entail that $\omega$ is a subset of $\tau$.
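The two special cases can be made concrete with a small sketch. It assumes, purely for illustration, a toy sort theory given by declared subsort pairs and one declared sort per constant, so that entailment reduces to reachability in the subsort relation. The names `subsort_closure` and `is_well_sorted` and the DOG/ANIMAL/BABY data are hypothetical and not part of the thesis, and function terms are not handled.

```python
def subsort_closure(sorts, declared):
    """Reflexive-transitive closure of the declared (sub, super) pairs."""
    leq = {(s, s) for s in sorts} | set(declared)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(leq):
            for (c, d) in list(leq):
                if b == c and (a, d) not in leq:
                    leq.add((a, d))
                    changed = True
    return leq


def is_well_sorted(theta, leq, const_sort):
    """Check a substitution against the toy sort theory.

    theta maps variables, written as (name, sort) tuples, to terms.
    A term is either another (name, sort) variable or a constant string.
    """
    for (_, tau), term in theta.items():
        if isinstance(term, tuple):
            # Variable case: the sort omega of the image must be a subset of tau.
            omega = term[1]
            if (omega, tau) not in leq:
                return False
        else:
            # Ground case: the constant must be of sort tau.
            if (const_sort[term], tau) not in leq:
                return False
    return True


# Hypothetical data: DOG is a subsort of ANIMAL; BABY is unrelated.
leq = subsort_closure({"DOG", "ANIMAL", "BABY"}, {("DOG", "ANIMAL")})
const_sort = {"Fido": "DOG", "Ralph": "BABY"}

ground_ok = is_well_sorted({("x", "ANIMAL"): "Fido"}, leq, const_sort)
var_ok = is_well_sorted({("x", "ANIMAL"): ("y", "DOG")}, leq, const_sort)
var_bad = is_well_sorted({("x", "DOG"): ("y", "ANIMAL")}, leq, const_sort)
```

In this sketch the empty dictionary plays the role of the identity substitution and passes the check for any hierarchy.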

Expression $e'$ is said to be a well sorted instance of $e$ relative to $\Sigma$ if $e' = e\theta$, for some
substitution $\theta$ that is well sorted relative to $\Sigma$. In the obvious way, I speak of well sorted ground
instances of a formula and write $e_{\Sigma gr}$ to denote the set of all ground instances of $e$ that are well
sorted relative to $\Sigma$.

Since  they  are  substitutions,  well  sorted  substitutions  enjoy  all  the  properties  possessed  by 
substitutions  in  general.  So,  for  example,  Substitution  Lemma  (3)  of  Chapter  3,  which  says 
that  composition  of  substitutions  is  associative,  trivially  holds  for  well  sorted  substitutions. 
For other reasons, the correlates of Substitution Lemmas (1) and (2) hold for well sorted
substitutions.


Well  Sorted  Substitution  Lemmas 

(1) The identity substitution, $\epsilon$, is well sorted relative to any sort theory.

(2) If $\sigma$ and $\theta$ are well sorted substitutions relative to $\Sigma$, then so is $\sigma\cdot\theta$.


Proof 

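(1): $\epsilon$ maps every variable $x{:}\tau$ to itself, and any $\Sigma$ entails $\forall x{:}\tau\ \tau(x)$ since $\forall x{:}\tau\ \tau(x)$ is a valid
sentence.

(2): I assume $\theta$ and $\sigma$ are $\Sigma$-well sorted, and show that for any variable $x{:}\tau$, $\Sigma \models \forall\,\tau((x{:}\tau)\theta\sigma)$.
Let $\phi[(y_1{:}\tau_1), \ldots, (y_n{:}\tau_n)]$ be $(x{:}\tau)\theta$. Since $\theta$ is $\Sigma$-well sorted, $\Sigma$ entails

$$\forall\, \tau(\phi[(y_1{:}\tau_1), \ldots, (y_n{:}\tau_n)])$$

which normalized is

$$\forall\, \tau_1(y_1) \wedge \cdots \wedge \tau_n(y_n) \rightarrow \tau(\phi[y_1, \ldots, y_n]) \qquad (5.15)$$

For $1 \le i \le n$ let $\psi_i[(z_{i,1}{:}\tau_{i,1}), \ldots, (z_{i,m_i}{:}\tau_{i,m_i})]$ be $(y_i{:}\tau_i)\sigma$.
Since $\sigma$ is $\Sigma$-well sorted, for each $1 \le i \le n$, $\Sigma$ entails

$$\forall\, \tau_i(\psi_i)$$

which normalizes as

$$\forall\, \tau_{i,1}(z_{i,1}) \wedge \cdots \wedge \tau_{i,m_i}(z_{i,m_i}) \rightarrow \tau_i(\psi_i[z_{i,1}, \ldots, z_{i,m_i}]) \qquad (5.16)$$

By combining each sentence of (5.16) with (5.15) we can conclude that $\Sigma$ entails

$$\forall(\tau_{1,1}(z_{1,1}) \wedge \cdots \wedge \tau_{1,m_1}(z_{1,m_1}) \wedge \cdots \wedge \tau_{n,m_n}(z_{n,m_n}) \rightarrow \tau(\phi[\psi_1, \ldots, \psi_n]))$$

which is the normalization of

$$\forall\, \tau((x{:}\tau)\theta\sigma) \qquad \blacksquare$$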


The  closure  of  the  well  sorted  substitutions  under  pairwise  composition,  as  stated  by  the 
second  lemma  above,  is  crucial  to  the  viability  of  considering  only  well  sorted  substitutions.  In 
previous  chapters  the  most  common  way  of  obtaining  new  substitutions  is  by  composing  exist¬ 
ing  substitutions.  The  closure  under  composition  of  the  set  of  well  sorted  substitutions  is  neces¬ 
sary  if  we  are  to  compute  with  well  sorted  substitutions  in  manners  akin  to  the  way  we  com¬ 
pute  with  ordinary  substitutions. 

Section  3.2  defines  what  it  means  for  a  substitution  to  be  a  unifier,  for  one  substitution  to 
be  more  general  than  another,  for  one  set  of  substitutions  to  be  a  complete  set  of  another,  and 
for  a  set  of  substitutions  to  be  most  general.  All  of  these  notions  can  be  adapted  to  well  sorted 
substitutions  as  follows: 

Definition:  Well  Sorted  Unifier 

Let $E$ be a set of expressions and $\theta$ be a substitution. $\theta$ is a well sorted unifier of $E$ relative to
$\Sigma$ if it is a unifier of $E$ and is well sorted relative to $\Sigma$.


Definition: $\Sigma$-More General

Let $\theta_1$ and $\theta_2$ be substitutions that are well sorted relative to $\Sigma$. $\theta_1$ is $\Sigma$-more general than $\theta_2$
(written $\theta_1 \ge_\Sigma \theta_2$) iff $\theta_1\cdot\sigma = \theta_2$ for some substitution $\sigma$ that is well sorted relative to $\Sigma$.

Definition: $\Sigma$-Complete Set

Let $\Theta'$ and $\Theta$ be sets of substitutions that are well sorted relative to $\Sigma$. Then $\Theta'$ is a
$\Sigma$-complete set of $\Theta$ iff:

(1) $\Theta'$ is correct; if $\theta' \in \Theta'$ and $\theta' \ge_\Sigma \theta$ then $\theta \in \Theta$.

(2) $\Theta'$ is complete; if $\theta \in \Theta$ then for some $\theta' \in \Theta'$, $\theta' \ge_\Sigma \theta$.

Definition: $\Sigma$-Most General Set of Substitutions

A set of $\Sigma$-substitutions is $\Sigma$-most general if it does not contain two distinct substitutions such
that one is $\Sigma$-more general than the other.

The Unification Theorem assures us that any finite unifiable set of expressions has a singleton
MGCU. However, this is not the case for $\Sigma$-unifiers; for some $\Sigma$ there exist $\Sigma$-unifiable sets
of expressions that have non-singleton $\Sigma$MGCUs ($\Sigma$-most general $\Sigma$-complete sets of
$\Sigma$-unifiers). Here is an example:

Let $\Sigma = \{\ \forall x,y\ ODD(x) \wedge ODD(y) \rightarrow EVEN(plus(x,y)),$

$\qquad\quad \forall x,y\ EVEN(x) \wedge EVEN(y) \rightarrow EVEN(plus(x,y))\ \}$

Let $E = \{ z{:}EVEN,\ plus(v,w) \}$

Then $\theta_1 = \{ plus(x{:}EVEN, y{:}EVEN)/z,\ x{:}EVEN/v,\ y{:}EVEN/w \}$
and $\theta_2 = \{ plus(x{:}ODD, y{:}ODD)/z,\ x{:}ODD/v,\ y{:}ODD/w \}$
are each $\Sigma$-unifiers of $E$ and
$\Theta = \{\theta_1, \theta_2\}$ is a $\Sigma$MGCU of $E$.

Notice that neither $\theta_1$ nor $\theta_2$ is $\Sigma$-more general than the other. Indeed, a finite set of
expressions may have an infinite $\Sigma$MGCU. For instance:

Let $\Sigma = \{\ \forall x\ T(i(x)) \rightarrow T(i(s(x))),\ T(i(a))\ \}$

Let $E = \{ z{:}T,\ i(s(y)) \}$

Then $\{\ \{a/y,\ i(s(a))/z{:}T\},\ \{s(a)/y,\ i(s(s(a)))/z{:}T\},\ \ldots\ \}$ is a $\Sigma$MGCU of $E$.

We now consider the Sorted Unification Algorithm, which given two expressions and a sort
theory determines whether the expressions are unifiable with respect to the sort theory, and, if
so, returns a $\Sigma$-complete set of $\Sigma$-unifiers for the two expressions. The form of the algorithm is
similar to that of the ordinary Unification Algorithm. Both algorithms repeatedly find a place
where the two expressions differ and remove the difference by applying a substitution. Not all
differences between two expressions can be removed by a $\Sigma$-substitution, but those that can are
said to be $\Sigma$-negotiable.

The notions of difference and negotiability can be captured by saying that two expressions,
one containing the subexpression $e_1$ where the other contains $e_2$, have the difference $\{e_1, e_2\}$.
The difference set of the two expressions contains all the differences that the two expressions
have.

Definition:  Difference  Set 

The difference set of expressions $E$ and $E'$ (written $DIFF(E,E')$) is defined as:

$DIFF(E,E') = \emptyset$ if $E$ and $E'$ are the same expression.

$\qquad = DIFF(e_1,e_1') \cup \cdots \cup DIFF(e_n,e_n')$

$\qquad\quad$ if $E$ is composed of constituents $e_1, e_2, \ldots, e_n$ and $E'$ is composed in the
$\qquad\quad$ same manner of constituents $e_1', e_2', \ldots, e_n'$.

$\qquad = \{\{E,E'\}\}$ otherwise.
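The three cases of this definition translate directly into a recursive function. The following is a minimal sketch, assuming (as an illustrative encoding that is not the thesis's) that a compound expression is a tuple whose first element is its functor, that variables and constants are plain strings, and that "composed in the same manner" means same functor and same number of constituents.

```python
def diff(e1, e2):
    """Return the difference set of two expressions as a set of unordered pairs."""
    if e1 == e2:
        return set()                      # same expression: empty difference set
    both_compound = isinstance(e1, tuple) and isinstance(e2, tuple)
    if both_compound and e1[0] == e2[0] and len(e1) == len(e2):
        result = set()
        for a, b in zip(e1[1:], e2[1:]):  # union over corresponding constituents
            result |= diff(a, b)
        return result
    return {frozenset((e1, e2))}          # otherwise: the pair itself


# Hypothetical expressions; "z:EVEN" stands for a restricted variable.
left = ("P", "z:EVEN", ("f", "a"))
right = ("P", ("plus", "v", "w"), ("f", "b"))
differences = diff(left, right)
```

Here `differences` contains two pairs: the restricted variable against the `plus` term, and the constants `a` and `b`. Only the first could be negotiable, since the second pair contains no variable.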

Definition:  E-Negotiable 

A pair of expressions $\{e_1, e_2\}$ is $\Sigma$-negotiable iff at least one of $e_1$, $e_2$ is a variable, say $x{:}\tau$, and
the other is a term, say $t$, such that $x{:}\tau$ does not occur in $t$ and $\Sigma \models \exists(\tau(t))$.

The Sorted Unification Algorithm makes non-deterministic choices of the same kind as
those made by the GSRA and the SSRA. Each successful execution path through the algorithm
results in the output of a single $\Sigma$-unifier. The set of all outputs produced by all execution
paths is a $\Sigma$-complete set of $\Sigma$-unifiers. If all execution paths terminate in FAILURE, then the
input expressions are not $\Sigma$-unifiable.

Sorted  Unification  Algorithm 

Input: expressions $A$ and $B$ of SFOPC, and $\Sigma$, a sort theory

Output: SUCCESS or FAILURE; if SUCCESS, then a substitution is also output

(1)  let $\sigma = \epsilon$

(2)  while $A\sigma \ne B\sigma$ do

(3)      select $\{U,V\} \in DIFF(A\sigma, B\sigma)$

(4)      if $\{U,V\}$ is negotiable then

(5)          let $x{:}\tau$ be whichever of $U$, $V$ is a variable and let $t$ be the other

(6)          let $\Theta$ be any $\Sigma$-complete set of $\Sigma$-substitutions such that
             for every $\theta \in \Theta$, $\Sigma \models \forall\tau(t\theta)$

(7)          choose $\lambda \in \Theta$

(8)          let $\sigma = \sigma\cdot\lambda\cdot\{t/x{:}\tau\}$

(9)      else FAIL

(10) SUCCEED with output $\sigma$

In step (3) a difference is selected from the difference set of $A\sigma$ and $B\sigma$. This is intended
to be the same kind of selection as considered in the discussion of selection functions. Recall
that selections, unlike choices, do not require different alternatives to be considered.

In step (5), if both $U$ and $V$ are variables then it does not matter which is taken as $x{:}\tau$.
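The control structure of the algorithm, with the selection of step (3) and the choice point of steps (6) and (7), can be sketched as a generator that yields one substitution per successful execution path. This is only an illustration under strong assumptions: the sort theory is replaced by a hypothetical two-level hierarchy (`SUBSORT`, `CONST_SORT`), the function `solutions` is a toy stand-in for step (6) that handles only variables and constants, and the new binding is built from $t$ with the chosen substitution already applied, which is one reading of step (8). None of the names below come from the thesis, and the result is not claimed to be $\Sigma$-complete for richer sort theories.

```python
import itertools

# A variable is ("var", name, sort); any other term is (functor, arg1, ...),
# so a constant is a 1-tuple such as ("Fido",).
SUBSORT = {("DOG", "ANIMAL"), ("CAT", "ANIMAL")}   # hypothetical hierarchy
CONST_SORT = {"Fido": "DOG", "Tom": "CAT"}          # hypothetical constants
_fresh = itertools.count()

def is_var(t):
    return t[0] == "var"

def apply(s, t):
    """Apply substitution s (dict: variable -> term) to term t."""
    if is_var(t):
        return s.get(t, t)
    return (t[0],) + tuple(apply(s, a) for a in t[1:])

def compose(s1, s2):
    """Substitution equivalent to applying s1 and then s2."""
    out = {v: apply(s2, t) for v, t in s1.items()}
    for v, t in s2.items():
        out.setdefault(v, t)
    return out

def diff(a, b):
    """List of differing subexpression pairs (the difference set)."""
    if a == b:
        return []
    if not is_var(a) and not is_var(b) and a[0] == b[0] and len(a) == len(b):
        return [d for x, y in zip(a[1:], b[1:]) for d in diff(x, y)]
    return [(a, b)]

def occurs(v, t):
    return v == t or (not is_var(t) and any(occurs(v, a) for a in t[1:]))

def leq(s, t):
    return s == t or (s, t) in SUBSORT   # enough for a two-level hierarchy

def solutions(tau, t):
    """Toy step (6): substitutions lam such that t*lam is of sort tau."""
    if is_var(t):
        if leq(t[2], tau):
            yield {}                                       # t already fits tau
        elif leq(tau, t[2]):
            yield {t: ("var", f"_v{next(_fresh)}", tau)}   # narrow t to tau
    elif len(t) == 1 and leq(CONST_SORT.get(t[0]), tau):
        yield {}                                           # constant of sort tau

def sorted_unify(a, b, solutions, sigma=None):
    """Yield one substitution per successful execution path."""
    sigma = {} if sigma is None else sigma
    diffs = diff(apply(sigma, a), apply(sigma, b))
    if not diffs:
        yield sigma                       # step (10): SUCCEED
        return
    u, v = diffs[0]                       # step (3): a selection, not a choice
    if not is_var(u):
        u, v = v, u                       # step (5): make u the variable
    if not is_var(u) or occurs(u, v):
        return                            # step (9): this path FAILs
    for lam in solutions(u[2], v):        # steps (6)-(7): a choice point
        binding = {u: apply(lam, v)}      # step (8), with lam applied to t
        yield from sorted_unify(a, b, solutions,
                                compose(compose(sigma, lam), binding))

x = ("var", "x", "ANIMAL")
y = ("var", "y", "DOG")
unifiers = list(sorted_unify(("P", x, ("Fido",)), ("P", y, y), solutions))
# Tom is declared a CAT, so a DOG-restricted variable cannot be bound to it.
no_unifiers = list(sorted_unify(("P", y), ("P", ("Tom",)), solutions))
```

Modelling choices as iteration over a generator means that exhausting the generator corresponds to exploring every execution path, while taking only its first element corresponds to computing a single unifier.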

As previously pointed out, it is possible for two expressions to have an infinite $\Sigma$MGCU.
In the algorithm this could arise in step (6) where there may not be a finite $\Sigma$-complete set of
$\Sigma$-substitutions $\Theta$ such that for every $\theta \in \Theta$, $\Sigma \models \forall\tau(t\theta)$. This is not necessarily a problem if all
we want to do is determine whether two expressions are unifiable, or even if we want to compute
a fixed number of unifiers, but it is certainly a problem if we want to compute a
$\Sigma$-complete set of $\Sigma$-unifiers.

Even if every possible choice of $\Theta$ is a singleton, step (6) may not be effectively computable.
There is no decision procedure for determining whether there is a $\theta$ such that $\Sigma \models \forall\tau(t\theta)$
or even whether $\Sigma \models \tau(t)$ for ground $t$. This is an immediate consequence of the undecidability
of FOPC. Observe that $\Sigma$'s limitation to monadic predicates has no effect on the decidability
since $\Sigma$ may still contain sentences with arbitrary function signs.

Therefore, if the Sorted Unification Algorithm is to effectively compute a finite
$\Sigma$-complete set of $\Sigma$-unifiers then $\Sigma$ must be such that

for any sort expression $\tau$, and any term $t$ there is a finite $\Sigma$-complete set $\Theta$ of
$\Sigma$-substitutions such that for every $\theta \in \Theta$, $\Sigma \models \forall\tau(t\theta)$ and, furthermore, there is an
effective procedure for finding that set.


An  important  problem,  which  is  not  addressed  in  this  thesis,  is  the  identification  of  a  set  of 
syntactic  constraints  that  guarantee  that  a  sort  theory  has  the  above  property. 

5.5.  Properties  of  RT 

To  begin  with,  RT  is  an  extension  of  RQ;  that  is,  over  the  sentences  of  FOPC  RT  agrees 
with  RQ. 

RQ-RT  Equivalence  Theorem 

A FOPC sequent is in the RQ-entailment relation iff it is in the RT-entailment relation.

Proof 

if clause: Trivial since every RQ-model is an RT-model.

only-if clause: If $\langle M, A_S \rangle$ is an RT-countermodel to a FOPC sequent then $M$ is an
RQ-countermodel to that sequent. ■

Under  the  Tarskian  semantics  for  FOPC  the  rule  of  Universal  Instantiation,  which  derives 
an  instance  of  a  universally-quantified  formula,  and  the  rule  of  Existential  Generalization, 
which  derives  an  existentially-quantified  formula  from  an  instance  of  it,  are  both  sound.  The 
following  lemmas  state  a  corresponding  result  for  SFOPC,  under  both  the  RT  and  the  T  model 
theories. 

Instantiation  and  Generalization  Lemmas 

Let $M$ be any RT model, $\psi$ be any A-formula of SFOPC, $\Sigma$ be any sort theory, and $\theta$ be any
$\Sigma$-substitution. Then:

Instantiation Lemma: If $M$ satisfies $\forall\psi$ and $\Sigma$, then $M$ satisfies $\forall(\psi\theta)$.

Generalization Lemma: If $M$ satisfies $\exists(\psi\theta)$ and $\Sigma$, then $M$ satisfies $\exists\psi$.

Since  these  lemmas  hold  for  any  RT  model  they  also  hold  for  any  T  model. 

As  with  other  Herbrand  theorems,  the  Sorted  Herbrand  Theorem  pertains  to  sequents  in 
Skolem  Normal  Form.  Skolem  Normal  Form  for  SFOPC  sequents  is  just  as  it  is  for  FOPC 
sequents:  sentences  in  the  antecedent  must  be  in  universal  prenex  form  and  those  in  the  conse¬ 
quent  must  be  in  existential  prenex  form. 

Before turning to the Sorted Herbrand Theorem, one last concept must be introduced. If
$\langle M, A_S \rangle$ is a Herbrand model then $A_S$ is called a Herbrand sort assignment. The Herbrand sort
assignments can be ordered in the following way:

$A_S \le A_S'$ iff for every sort symbol $S$, $A_S(S) \subseteq A_S'(S)$

As will be seen, sort theories that have a least Herbrand sort assignment play a central
role in this section.2

Sorted  Herbrand  Theorem  for  RT 

Let $\Sigma, akb \Rightarrow q$ be any SFOPC sequent in Skolem Normal Form such that $\Sigma$ has a least
Herbrand sort assignment. Then $\Sigma, akb \models_{RT} q$ iff $akb_{\Sigma gr} \models_{RT} q_{\Sigma gr}$.

Proof 

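if clause: From the Instantiation and Generalization Lemmas it follows that if RT model $M$
satisfies $\Sigma \cup akb$ and falsifies $q$, then $M$ satisfies $akb_{\Sigma gr}$ and falsifies $q_{\Sigma gr}$.

only-if clause: If there is a countermodel to $akb_{\Sigma gr} \Rightarrow q_{\Sigma gr}$ then there is a Herbrand
countermodel, $\langle M, A_S \rangle$, in which $A_S$ is the least Herbrand sort assignment of $\Sigma$. I show that $\langle M, A_S \rangle$ is
also a countermodel to $\Sigma, (akb^N)_{gr} \Rightarrow (q^N)_{gr}$. I first consider $\alpha$, an arbitrary sentence in $akb$,
and show that $\langle M, A_S \rangle$ satisfies $(\alpha^N)\theta$, an arbitrary ground instance of $\alpha^N$. $(\alpha^N)\theta$ is of the form
$\tau_1 \wedge \cdots \wedge \tau_n \rightarrow \beta$. If $\beta \notin akb_{\Sigma gr}$ then $\Sigma \not\models \tau_1 \wedge \cdots \wedge \tau_n$. Since $A_S$ is a least Herbrand sort
assignment, $\langle M, A_S \rangle$ falsifies $\tau_1 \wedge \cdots \wedge \tau_n$ and hence satisfies $(\alpha^N)\theta$. On the other hand, if
$\beta \in akb_{\Sigma gr}$ then $\langle M, A_S \rangle$ satisfies $\beta$ (since it satisfies $akb_{\Sigma gr}$) and therefore also satisfies $(\alpha^N)\theta$. I
now show that $\langle M, A_S \rangle$ falsifies $(q^N)\theta$, for any ground substitution $\theta$. $(q^N)\theta$ is of the form
$\tau_1 \wedge \cdots \wedge \tau_n \wedge \beta$. If $\beta \notin q_{\Sigma gr}$ then $\Sigma \not\models \tau_1 \wedge \cdots \wedge \tau_n$. Since $A_S$ is a least Herbrand sort assignment,
$\langle M, A_S \rangle$ falsifies $\tau_1 \wedge \cdots \wedge \tau_n$ and hence falsifies $(q^N)\theta$. On the other hand, if $\beta \in q_{\Sigma gr}$ then
$\langle M, A_S \rangle$ falsifies $\beta$ (since it falsifies $q_{\Sigma gr}$) and therefore falsifies $(q^N)\theta$. ■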

What happens if $\Sigma$ has more than one minimal Herbrand sort assignment? Consider the
knowledge base consisting of $\Sigma$ and $akb$:

$\Sigma = \{\ BABY(Ralph) \vee DOG(Ralph)\ \}$

$akb = \{\ \forall x{:}DOG\ Annoys(x, Alan),\ \forall x{:}BABY\ Annoys(x, Alan)\ \}$

Notice that $\Sigma$ has two minimal Herbrand sort assignments; one that satisfies only
"BABY(Ralph)" and another that satisfies only "DOG(Ralph)." $akb_{\Sigma gr} = \emptyset$ because $\Sigma$ does not
logically imply any atomic sentences. Now, here is the problem: $\Sigma \cup akb$ RT-entails
$Annoys(Ralph, Alan)$, but $akb_{\Sigma gr}$ does not.
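The two minimal sort assignments in this example can be enumerated mechanically. The sketch below assumes a sort theory consisting only of ground positive clauses over a finite set of sort atoms, which covers the BABY/DOG example but not sort theories in general. A sort assignment is represented by the set of sort atoms it makes true, so the ordering on assignments becomes set inclusion. The function names are hypothetical.

```python
from itertools import combinations

def models(atoms, clauses):
    """Yield every set of atoms that satisfies all (ground, positive) clauses."""
    atoms = sorted(atoms)
    for r in range(len(atoms) + 1):
        for chosen in combinations(atoms, r):
            m = frozenset(chosen)
            if all(m & clause for clause in clauses):
                yield m

def minimal(ms):
    """Assignments with no strictly smaller assignment in ms."""
    ms = list(ms)
    return [m for m in ms if not any(other < m for other in ms)]

def least(ms):
    """The assignment below all others, or None if there is none."""
    ms = list(ms)
    for m in ms:
        if all(m <= other for other in ms):
            return m
    return None

atoms = {"BABY(Ralph)", "DOG(Ralph)", "BABY(Alan)", "DOG(Alan)"}
sigma = [frozenset({"BABY(Ralph)", "DOG(Ralph)"})]   # BABY(Ralph) v DOG(Ralph)

minimal_assignments = minimal(models(atoms, sigma))
least_assignment = least(models(atoms, sigma))
```

For this sort theory `minimal_assignments` holds the two singleton assignments and `least_assignment` is `None`. Replacing the disjunction by the single unit clause `DOG(Ralph)` yields a least assignment.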

Reiter (1977) noticed that this difficulty arose in his work on deductive databases. His
solution was to insist that $\Sigma$ satisfy a condition called "$\tau$-completeness": for every sort
symbol $S$ and every term $t$, either $\Sigma \models S(t)$ or $\Sigma \models \neg S(t)$. This condition is equivalent to
requiring that $\Sigma$ has a unique Herbrand sort assignment, not merely a unique minimal one.
Though Reiter found a sufficient condition, it is grossly over-restrictive. What about the
condition that $\Sigma$ must have a least Herbrand sort assignment? Is it also over-restrictive? After
a lemma is presented, the Necessity/Sufficiency Theorem asserts that the condition is necessary.
The proof shows that for any $\Sigma$ having multiple minimal Herbrand sort assignments, an
example like the above baby-and-dog one can be constructed.

2 These least Herbrand sort assignments are akin to the least Herbrand models that are central to the theory of
logic programming. In that literature, least models are often called unique minimal models.

Least  Model  Lemma 

Let $\Psi$ be a set of universal prenex-form sentences. Then $\Psi$ has a least Herbrand model iff for
any finite set of ground atomic formulas $A$, $\Psi \models \bigvee A$ implies $\Psi \models \alpha$, for some $\alpha \in A$.

Proof 

if clause: Let $PG$ be the set of all ground instances of the clausal form of $\Psi$. Furthermore, let
$M$ be the greatest lower bound of all Herbrand models of $\Psi$. I assume that for any finite set of
ground atomic formulas $A$, $\Psi \models \bigvee A$ implies $\Psi \models \alpha$, for some $\alpha \in A$, and I show that $M$ satisfies
$PG$ and therefore $\Psi$. Let $C = \neg\alpha_1 \vee \cdots \vee \neg\alpha_n \vee \beta_1 \vee \cdots \vee \beta_m$ be an arbitrary clause in $PG$. If
some model of $\Psi$ satisfies one $\neg\alpha_i$ then so does $M$ and hence $M$ satisfies $C$. Otherwise every
model of $\Psi$ falsifies every $\neg\alpha_i$ and hence $\Psi \models \beta_1 \vee \cdots \vee \beta_m$. By the assumption, there is an $i$
such that $\Psi \models \beta_i$, and $M$ satisfies $\beta_i$ and therefore $M$ satisfies $C$.

only if clause: Assume $\Psi$ has a least Herbrand model $M$ and that $A$ is a set of atomic sentences
such that $\Psi \models \bigvee A$. Then it must be that $M$ does not falsify $\bigvee A$, and hence satisfies some $\alpha \in A$.
But since $M$ is a least model, every Herbrand model of $\Psi$ satisfies $\alpha$. Therefore $\Psi \models \alpha$. ■

Necessity /Sufficiency  Theorem 

Let $\Sigma$ be a sort theory where each sentence is in universal prenex form. It is both necessary and
sufficient that $\Sigma$ has a least Herbrand sort assignment for the following statement to hold:

For every Skolem Normal Form SFOPC sequent of the form $\Sigma, akb \Rightarrow q$,

$\Sigma, akb \models_{RT} q$ iff $akb_{\Sigma gr} \models_{RT} q_{\Sigma gr}$

Proof 

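Sufficiency: This is equivalent to the Sorted Herbrand Theorem for RT.

Necessity: I assume that $\Sigma$ does not have a least Herbrand sort assignment and construct an
$akb$ and $q$ such that $\Sigma, akb \models_{RT} q$, but $akb_{\Sigma gr} \not\models_{RT} q_{\Sigma gr}$. By the Least Model Lemma, there is a
finite set of ground atomic formulas, $A = \{P_1(t_1), \ldots, P_n(t_n)\}$ such that $\Sigma \models \bigvee A$ and for every
$1 \le i \le n$, $\Sigma \not\models P_i(t_i)$. Let $akb$ be the set of sentences $\{\forall x{:}P_i\ Q(x) \mid 1 \le i \le n\} \cup \{R(t_i) \mid 1 \le i \le n\}$
and let $q$ be $\exists x\ Q(x) \wedge R(x)$. Then $akb_{\Sigma gr} = \{R(t_i) \mid 1 \le i \le n\}$ and $q_{\Sigma gr} = q_{gr}$. Therefore
$akb_{\Sigma gr} \not\models_{RT} q_{\Sigma gr}$ even though $\Sigma, akb \models_{RT} q$. ■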

The  final  result  of  this  section  shows  that  the  retriever  specified  by  RT  can  retrieve  a  fact 
only  if  it  can  retrieve  that  fact  from  the  SKB  and  a  single  sentence  of  the  AKB.  Thus,  RT 
meets  one  of  the  objectives  that  motivated  its  definition:  it  does  not  sanction  any  chaining  other 
than  with  taxonomic  information. 


Specialized  Chaining  Theorem 

Let $\Sigma$ be a universal prenex form sort theory that has a least Herbrand sort assignment, and let
$Q$ be a set of facts. Then, $\Sigma, akb \models_{RT} Q$ only if for some $\alpha \in akb$ and $q \in Q$, $\Sigma, \alpha \models_{RT} q$.

Proof 

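if $\Sigma, akb \models_{RT} Q$

then $akb_{\Sigma gr} \models_{RT} Q$ (Sorted Herbrand Theorem)

then $akb_{\Sigma gr} \models_{RQ} Q$ (RQ-RT Equivalence Theorem)

then $\alpha' \models_{RQ} q$, for some $\alpha' \in akb_{\Sigma gr}$ and $q \in Q$ (Generalized No-Chaining Theorem)

then $\alpha' \models_{RT} q$, for some $\alpha' \in akb_{\Sigma gr}$ and $q \in Q$ (RQ-RT Equivalence Theorem)

$\alpha'$ is a well sorted instance of some sentence $\alpha \in akb$, and hence by the Universal Instantiation
Lemma $\Sigma, \alpha \models_{RT} \alpha'$. Therefore, $\Sigma, \alpha \models_{RT} q$ for some $\alpha \in akb$. ■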

As with the Sorted Herbrand Theorem, the correctness of the Specialized Chaining
Theorem requires that $\Sigma$ has a least Herbrand sort assignment. Otherwise, it is possible to
chain together two A-sentences and a consequence of $\Sigma$. For example, once again consider the
KB consisting of $\Sigma$ and $akb$:

$\Sigma = \{\ BABY(Ralph) \vee DOG(Ralph)\ \}$

$akb = \{\ \forall x{:}DOG\ Annoys(x, Alan),\ \forall x{:}BABY\ Annoys(x, Alan)\ \}$

Observe that $Annoys(Ralph, Alan)$ RT-follows from $\Sigma \cup akb$ but not from any of its subsets.

5.6.  Computing  Retrievability

This  section  presents  and  proves  correct  an  algorithm  that  decides  whether  an  SFOPC 
sequent  is  in  the  RT-entailment  relation.  The  presentation  takes  place  in  two  subsections.  The 
first  develops  an  algorithm  that  solves  retrieval  problems  with  restricted  schematic  variables 
and  the  second  shows  how  retrieval  problems  with  restricted  quantified  variables  can  be  reduced 
to  retrieval  problems  of  the  first  variety. 

5.6.1.  Answering  Sorted  Schematic  Queries 

A sorted schematic sentence of QFPC is a sentence of QFPC in which schematic variables
with restrictions may take the place of ordinary terms. $\Sigma, akb \Rightarrow q$ is a sorted schematic sequent
of QFPC iff $\Sigma$ is a sort theory whose sentences are in universal prenex form, $akb$ is a set of
sorted schematic sentences of QFPC and $q$ is a sorted schematic sentence of QFPC. If
$\Sigma, akb \Rightarrow q$ is a sorted schematic sequent of QFPC, then $q$ is retrievable from $\Sigma, akb$ iff for some
$\Sigma$-substitution $\sigma$, $akb_{\Sigma gr} \models_{RQ} q\sigma$.

The Sorted Schematic Sequent Retrieval Algorithm (SSSRA) decides whether a sorted
schematic CNF sequent of QFPC is in this retrievability relation. This algorithm is identical to
the Schematic Sequent Retrieval Algorithm with the exception that step (15) computes
$\Sigma$MGCUs instead of ordinary MGCUs. Thus, when the SSSRA halts it indicates SUCCESS or
FAILURE, and if SUCCESS is indicated the algorithm also outputs a $\Sigma$-substitution.

The algorithm yields a provability relation called SSSRA-provability.

Definition: SSSRA-provable

Let $\Sigma, akb \Rightarrow q$ be a sorted schematic CNF sequent of QFPC. Then, $q$ is SSSRA-provable from
$\Sigma \cup akb$ (written $\Sigma, akb \vdash_{SSSRA} q$) with extracted answer $\theta$ iff there is some sequence of choices for
which the Sorted Schematic Sequent Retrieval Algorithm halts with SUCCESS and outputs $\theta$
when input $\Sigma, akb \Rightarrow q$.

The next two results establish that the SSSRA lifts the GSRA. First, the Ground
Equivalence Theorem states that the SSSRA and the GSRA behave identically when input a
ground sequent. Then the Sorted Lifting Theorem states that computations performed by the
SSSRA on sorted schematic sequents are themselves schematic for computations on ground
sequents.

Ground  Equivalence  Theorem 

Let $kb \Rightarrow q$ be a ground CNF sequent of QFPC. Then $kb \vdash_{SSSRA} q$ iff $kb \vdash_{GSRA} q$. Moreover, the
Ground Sequent Retrieval Algorithm and the Sorted Schematic Sequent Retrieval Algorithm
implicitly define isomorphic search spaces. They differ only in that every arc in the schematic
search space has an additional label of the form "$\theta_i = \epsilon$".


Sorted  Lifting  Theorem 

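Let $\Sigma, akb \Rightarrow q$ be a sorted schematic CNF sequent of QFPC and $\sigma$ be a $\Sigma$-ground substitution
for $SVARS(q)$. Then $akb_{\Sigma gr} \vdash_{SSSRA} q\sigma$ iff for some $\Sigma$-substitution $\gamma \ge_\Sigma \sigma$, $\Sigma, akb \vdash_{SSSRA} q$ with
extracted answer $\gamma$.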


The Ground Equivalence Theorem can be verified by straightforward examination of the
GSRA and the SSSRA. The Sorted Lifting Theorem can be proved by an argument produced
by systematically modifying the proof of the Lifting Theorem. Simply replace all occurrences of
the words "substitution," "unifier" and "MGCU" with the words "$\Sigma$-substitution", "$\Sigma$-unifier"
and "$\Sigma$MGCU" respectively. The resulting argument is correct because all properties of
substitutions, unifiers, and MGCUs that the original proof relies on are also properties (as established
in Section 5.4) of $\Sigma$-substitutions, $\Sigma$-unifiers, and $\Sigma$MGCUs.


Recall that the statement of the SSRA in Section 3.4 is more general than necessary. Step
(15) of that algorithm allows for the situation where two expressions have a non-singleton
MGCU, in spite of the fact that the Unification Algorithm always returns a singleton. The
justification for this overgenerality is now apparent. By allowing for non-singleton MGCUs,
the SSRA and the proof of its Lifting Theorem could be transformed trivially into the SSSRA
and a proof of its Sorted Lifting Theorem.

This  section  concludes  by  establishing  the  correctness  of  the  SSSRA. 

SSSRA  Correctness  Theorem 

Let $\Sigma, akb \Rightarrow q$ be a sorted schematic CNF sequent of QFPC and $\Gamma = \{\gamma \mid \Sigma, akb \vdash_{SSSRA} q$ with
extracted answer $\gamma\}$. Then $\Gamma$ is a $\Sigma$-complete set of general answers to $\Sigma, akb \Rightarrow q$.

Proof 

Let $\theta$ be any ground $\Sigma$-substitution. Then:

$\theta$ is an answer to $\Sigma, akb \Rightarrow q$

iff $akb_{\Sigma gr} \models_{RQ} q\theta$ (Definition of Retrievability)

iff $akb_{\Sigma gr} \vdash_{GSRA} q\theta$ (GSRA Correctness Theorem)

iff $akb_{\Sigma gr} \vdash_{SSSRA} q\theta$ (Ground Equivalence Theorem)

iff $\Sigma, akb \vdash_{SSSRA} q$ with some extracted answer $\gamma \ge_\Sigma \theta$ (Sorted Lifting Theorem)

Therefore, if $\gamma$ is an extracted answer then every ground $\Sigma$-substitution $\theta \le_\Sigma \gamma$ for $SVARS(q)$ is
an answer. So $\gamma$ is a generalized answer. Going the other way, if $\theta$ is an answer then some
$\gamma \ge_\Sigma \theta$ is an extracted answer. Hence $\Gamma$ is a complete set of generalized answers. ■

5.6.2.  Computing  Retrievability  of  SFOPC  Sequents 

This  subsection  shows  the  following  four  steps  can  be  used  to  decide  whether  an  arbitrary 
sequent  of  SFOPC  is  in  the  RT-entailment  relation: 

(1)  Transform  the  sequent  into  Skolem  Normal  Form. 

(2)  Transform  the  matrix  of  every  A-sentence  of  the  sequent  into  Conjunctive  Normal  Form. 

(3)  Remove all quantifiers from every A-sentence of the sequent and replace the restricted
quantified variables with restricted schematic variables.

(4)  The  resulting  sequent  is  a  sorted  schematic  CNF  sequent  of  QFPC.  Hand  it  over  to  the 
Sorted  Schematic  Sequent  Retrieval  Algorithm. 

Here  is  a  transformation  that  puts  any  SFOPC  sequent  into  Skolem  Normal  Form  and  a 
theorem  stating  that  the  transformation  preserves  RT-entailment. 


Definition:  Skolem  Normal  Transformation 

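Let $\Sigma, akb \Rightarrow q$ be a prenex form sequent of SFOPC. Its Skolem Normal Transform is computed
by the following steps.

(1) Put $\Sigma$ into prenex form via the usual method for Tarskian FOPC.

(2) While $\Sigma$ contains a sentence $\phi$ that contains an existential quantifier do:

Observe that $\phi$ is of the form

$\forall x_1\ \forall x_2 \cdots \forall x_n\ \exists y\ \psi[y]$

for some $n \ge 0$ and prenex-form formula $\psi[y]$. Choose $\xi$, some $n$-ary function symbol
that does not occur in $\Sigma, akb \Rightarrow q$ and replace $\phi$'s occurrence in $\Sigma, akb \Rightarrow q$ with

$\forall x_1\ \forall x_2 \cdots \forall x_n\ \psi[\xi(x_1, \ldots, x_n)]$

(3) While $akb$ contains a sentence $\phi$ that contains an existential quantifier do:

Observe that $\phi$ is of the form

$\forall x_1{:}\tau_1\ \forall x_2{:}\tau_2 \cdots \forall x_n{:}\tau_n\ \exists y{:}\tau_y\ \psi[y]$

for some $n \ge 0$ and prenex-form formula $\psi[y]$. Choose $\xi$, some $n$-ary function symbol
that does not occur in $\Sigma, akb \Rightarrow q$ and replace $\phi$'s occurrence in $\Sigma, akb \Rightarrow q$ with

$\forall x_1{:}\tau_1\ \forall x_2{:}\tau_2 \cdots \forall x_n{:}\tau_n\ \psi[\xi(x_1, \ldots, x_n)]$

and add

$\forall x_1, \ldots, x_n\ \tau_1(x_1) \wedge \tau_2(x_2) \wedge \cdots \wedge \tau_n(x_n) \rightarrow \tau_y(\xi(x_1, \ldots, x_n))$

to $\Sigma$.

(4) While $q$ contains a universal quantifier do:

Observe that $q$ is of the form

$\exists x_1{:}\tau_1\ \exists x_2{:}\tau_2 \cdots \exists x_n{:}\tau_n\ \forall y{:}\tau_y\ \psi[y]$

for some $n \ge 0$ and prenex-form formula $\psi[y]$. Choose $\xi$, some $n$-ary function symbol
that does not occur in $\Sigma, akb \Rightarrow q$ and replace $q$'s occurrence in $\Sigma, akb \Rightarrow q$ with

$\exists x_1{:}\tau_1\ \exists x_2{:}\tau_2 \cdots \exists x_n{:}\tau_n\ \psi[\xi(x_1, \ldots, x_n)]$

and add

$\forall x_1, \ldots, x_n\ \tau_1(x_1) \wedge \tau_2(x_2) \wedge \cdots \wedge \tau_n(x_n) \rightarrow \tau_y(\xi(x_1, \ldots, x_n))$

to $\Sigma$.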
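Step (3) of the transformation can be illustrated with a small sketch that Skolemizes one prenex A-sentence and collects the sort axioms to be added to the sort theory. The encoding is an assumption made for the example: the prefix is a list of (quantifier, variable, sort) triples, the matrix is a nested tuple of strings, and `names` is an iterator that the caller guarantees to produce function symbols not occurring elsewhere in the sequent. The function names are hypothetical.

```python
from itertools import count

def substitute(term, var, replacement):
    """Replace every occurrence of the string var inside a nested tuple."""
    if term == var:
        return replacement
    if isinstance(term, tuple):
        return tuple(substitute(t, var, replacement) for t in term)
    return term

def skolemize_a_sentence(prefix, matrix, names):
    """Remove existential quantifiers from one prenex A-sentence.

    Returns (new_prefix, new_matrix, sort_axioms). Each sort axiom is a pair
    (body, head) read as: tau_1(x_1) & ... & tau_n(x_n) -> tau_y(f(x_1..x_n)).
    """
    universals, axioms = [], []
    for quant, var, sort in prefix:
        if quant == "forall":
            universals.append((var, sort))
            continue
        # Existential: introduce a Skolem term over the universals seen so far.
        skolem_term = (next(names),) + tuple(v for v, _ in universals)
        matrix = substitute(matrix, var, skolem_term)
        body = tuple((s, v) for v, s in universals)
        axioms.append((body, (sort, skolem_term)))
    new_prefix = [("forall", v, s) for v, s in universals]
    return new_prefix, matrix, axioms

# Hypothetical sentence: forall x:PERSON exists y:DOG Owns(x, y)
fresh_names = (f"sk{i}" for i in count(1))
prefix = [("forall", "x", "PERSON"), ("exists", "y", "DOG")]
result = skolemize_a_sentence(prefix, ("Owns", "x", "y"), fresh_names)
```

The returned sort axiom records that the Skolem term is of the sort that restricted the eliminated variable, which is what later allows a well sorted substitution to bind a DOG-restricted variable to that term.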

Skolem  Normal  Transform  Theorem 

A generalized sequent of SFOPC is in $\models_{RT}$ iff its Skolem Normal Transform is.

A prenex form sentence can be placed in CNF by applying the Conjunctive Normal
Transformation to its matrix.3 Since a quantifier-free formula and its Conjunctive Normal
Transform are RP-equivalent, they are also RQ- and RT-equivalent. Hence any A-sentence is
RT-equivalent to its Conjunctive Normal Transform. Therefore, every A-sentence in an
SFOPC sequent can be placed in CNF while preserving the retrievability of the sequent.

After  applying  these  two  transformations — the  SNT  and  the  CNT —  the  retrievability  of 
the  resulting  sequent  can  be  determined  by  dropping  all  quantifiers  from  its  A-sentences, 
replacing  its  restricted  quantified  variables  by  restricted  schematic  variables,  and  handing  the 
result  over  to  the  SSSRA.  The  following  theorem  asserts  the  correctness  of  this  operation. 

SFOPC  Retrieval  Theorem 

Let $\Sigma, akb \Rightarrow q$ be an ordinary SNF sequent of SFOPC and let $\Sigma, akb' \Rightarrow q'$ be the schematic
sequent that results from removing all quantifiers from $akb$ and $q$. Then $\Sigma, akb \models_{RT} q$ iff
$\Sigma, akb' \vdash_{SSSRA} q'$.


Proof 

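$\Sigma, akb \models_{RT} q$

iff $akb_{\Sigma gr} \models_{RT} q_{\Sigma gr}$ (Sorted Herbrand Theorem)

iff $akb_{\Sigma gr} \models_{RQ} q_{\Sigma gr}$ (RQ-RT Equivalence Theorem)

iff for some $qg \in q_{\Sigma gr}$, $akb_{\Sigma gr} \models_{RQ} qg$ (Minuteness Theorem)

iff for some $qg' \in q'_{\Sigma gr}$, $akb'_{\Sigma gr} \models_{RQ} qg'$ (Since $akb_{\Sigma gr} = akb'_{\Sigma gr}$ and $q_{\Sigma gr} = q'_{\Sigma gr}$)

iff $\Sigma, akb' \vdash_{SSSRA} q'$ (SSSRA Correctness Theorem) ■


3 See Section 2.5.1.
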


Chapter  6 


Conclusions 


The principal contribution of this thesis has been the transformation of retrieval from an
ill-defined, unstudied process to a formally-defined and well-studied one. The key to the
success of this transformation lay in adopting the viewpoint of knowledge retrieval as a specialized
inference  process.  More  specifically,  knowledge  retrieval  was  argued  to  be  an  inference  process 
limited  in  such  a  way  that  it  is  fundamentally  a  pattern-matching  process,  though  it  may  be 
extended  to  do  a  bounded  amount  of  some  specialized  form  of  chaining.  This  characterization 
of  retrieval  was  formalized  by  replacing  the  intuitive  notion  of  pattern  matching  with  the  pre¬ 
cise  notion  of  no-chaining. 

The  body  of  this  thesis  was  devoted  to  formally  specifying  a  series  of  four  retrievers,  cul¬ 
minating  in  the  specification  of  one  that  fits  the  above  characterization;  it  performs  all  infer¬ 
ences  that  don’t  require  any  chaining,  all  inferences  that  involve  taxonomic  information,  and  no 
others. 


6.1.  External  Contributions 

The  logical  developments  of  this  thesis  have  been  motivated  by  and  applied  to  the  study  of 
knowledge  retrieval.  Nonetheless,  they  may  have  important  applications  outside  the  study  of 
knowledge  retrieval. 

In  addition  to  knowledge  retrievers,  many  computational  mechanisms  used  in  artificial 
intelligence  manipulate  representations.  It  is  my  working  hypothesis  that  we  can  go  a  long  way 
by  specifying  and  studying  these  mechanisms  as  inference  engines.  This  thesis  supplies  a  piece 
of  evidence  in  support  of  the  hypothesis.  The  techniques  and  results  of  this  thesis — especially 
the  model-theoretic  specification  technique  proposed  in  Chapter  1  and  used  throughout — may 
be  useful  in  studying  other  systems  that  can  be  viewed  as  incomplete  inference  engines. 

RP,  the  propositional  logic  presented  in  Chapter  2,  forms  the  basis  of  all  the  retrievers  in 
this  thesis.  This  "logic  of  no-chaining"  is  potentially  useful  as  the  basis  of  logics  used  for  other 
purposes.  One  example  is  obvious:  RP  could  form  the  basis  of  a  logic  of  explicit  belief  in  the 
same  way  that  Belnap’s  four-valued  logic  forms  the  basis  of  Levesque’s  (1984b)  and 
Lakemeyer’s  (1986)  logics  of  belief. 

Many  systems  based  on  automated  deduction — deductive  databases,  theorem  provers, 
logic-programming  systems,  etc. — are  designed  to  provide  exact  answers  to  queries.  That  is, 
variables  in  a  query  get  bound  only  once  in  a  proof.  Indeed,  it  is  crucial  that  logic  programs 
have  this  property  if  they  are  to  be  executed  by  theorem  provers  that  behave  at  all  like
interpreters  for  traditional  programming  languages.  Chapter  4  of  this  thesis  revealed  the  Strong
Herbrand  Property  as  the  semantic  counterpart  of  the  exact  answer  property.  The  Strong
Herbrand  Property  was  shown  to  be  composed  of  the  Herbrand  Property  and  the  Minuteness
Property.  RQ,  a  model  theory  possessing  the  Minuteness  Property,  was  constructed  from  RP,  a
non-minute  model  theory.  The  same  construction  can  be  used  to  construct  other  minute  logics
from  non-minute  ones.  Consequently,  the  analysis  of  Chapter  4  may  have  significant  and
profound  application  to  the  theory  of  systems  that  compute  exact  answers.
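
To make the exact answer property concrete, here is a minimal sketch assuming a KB of ground atoms and a query atom whose variables are strings beginning with `?`. The functions `match` and `exact_answers` are hypothetical names, not algorithms from this thesis. Each answer is a single substitution in which every query variable is bound exactly once.

```python
# Illustrative sketch only. Assumptions: atoms are tuples such as
# ("parent", "ann", "bob"); variables are strings starting with "?".

def match(pattern, fact):
    """One-way match of a query atom against a ground atom.

    Returns a dict of bindings, or None if the atoms do not match.
    """
    if len(pattern) != len(fact):
        return None
    bindings = {}
    for p, f in zip(pattern, fact):
        if isinstance(p, str) and p.startswith("?"):
            # A variable is bound once; a repeated variable must agree.
            if bindings.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return bindings

def exact_answers(kb, query):
    """Collect one substitution per matching fact."""
    answers = []
    for fact in kb:
        bindings = match(query, fact)
        if bindings is not None:
            answers.append(bindings)
    return answers

kb = [("parent", "ann", "bob"), ("parent", "ann", "cal")]
print(exact_answers(kb, ("parent", "ann", "?x")))
```

By contrast, a KB that knew only the disjunction "ann is a parent of bob or of cal" would entail that ann has a child without supporting any single binding for `?x`, which is the situation the exact answer property rules out.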

The  part  of  this  thesis  with  the  most  scope  for  application  outside  the  domain  of  retrieval 
is  the  method  introduced  in  Chapter  5  for  transforming  an  unsorted  computational  logic  into  a 
sorted  one.  Nothing  about  the  method  was  specific  to  retrieval  per  se,  and  it  could  be  used  to 
transform  any  logical  system  that  handles  quantified  variables  by  using  unification.  Indeed,  this 
method  has  been  used  to  add  sorts  to  a  logic-programming  system  (Allen,  Giuliano,  and  Frisch, 
1983;  Frisch,  Allen,  and  Giuliano,  1983)  and  to  the  design  of  a  deductive  parser  (Frisch,  1985b).

There  are  two  reasons  why  one  may  want  to  transform  an  unsorted  logic  into  a  sorted 
one.¹  One  reason  is  that  the  additional  syntactic  devices  are  useful  in  building  a  specification  of
what  entails  what.  This  motivated  the  introduction  of  a  sorted  logic  in  Chapter  5  where  it  was 
used  to  specify  a  retriever  that  performed  certain  inferences  but  not  others.  This,  however, 
provides  no  motivation  for  adding  sorts  to  an  unsorted  Tarskian  FOPC  since  that  logic  already 
performs  the  taxonomic  inferences  in  question.  The  other  reason  is  efficiency:  though  the  use  of  a
sorted  logic  may  not  alter  what  entails  what,  it  can  be  used  to  build  efficient  deductive  mechanisms.
Elsewhere  (Frisch,  1985b),  I  have  demonstrated  that  introducing  sorts  into  a  logic  yields  smaller
deductive  search  spaces,  search  spaces  that  exhibit  a  minimum-commitment  search  strategy.


6.2.  Extensions 

These  are  some  extensions  to  the  current  work  that  are  worthy  of  investigation: 

•  Develop  retrieval  algorithms  that  do  not  first  transform  sentences  into  normal  form. 

•  Redefine  RQ  and  RT  so  that  they  handle  non-prenex-form  sentences.  In  particular,  the  new 
model  theories  should  admit  the  usual  equivalences  that  allow  sentences  to  be  transformed 
into  prenex  form. 

•  Extend  the  language  of  SFOPC  so  that  sort  atoms  can  be  mixed  more  freely  into
S-sentences.

•  Endow  the  retriever  with  the  ability  to  reason  about  its  own  knowledge.  Following 
Levesque’s  (1984c)  approach,  this  could  be  done  by  identifying  retrievability  with  a  modal 
operator  in  the  representation  language  instead  of  with  a  logical-implication  relation. 

•  Extend  the  retriever  to  reason  about  equality. 


¹  This  argument  is  general  and  can  be  used  to  motivate  other  syntactic  extensions  to  a  computational  logic.




•  Specify  a  retriever  that  can  handle  both  wh-queries  and  quantifiers. 

•  Extend  SFOPC  to  allow  higher-order  restrictions  to  be  placed  on  variables  and  build  a
retriever  that  can  reason  about  such  restrictions.  As  it  now  stands,  SFOPC  allows  restrictions
to  be  placed  on  the  values  that  an  individual  variable  takes  on.  What  I  have  in  mind
is  placing  restrictions  on  the  values  that  an  n-tuple  of  variables  can  take  on.  For  example,
in  the  sentence  ∀(x,y):≠  x<y ∨ y<x  the  pair  of  variables  must  take  on  pairs  of  values
drawn  from  the  ≠  relation.  A  sort  theory  then  would  need  to  include  binary  predicate  symbols,
such  as  '≠',  and  perhaps  even  higher-order  predicate  symbols.  I  hypothesize  that  the
techniques  developed  in  Chapter  5  would  generalize  to  this  extension  in  a  straightforward
manner.